This invention relates to T7-type DNA polymerases and methods for using them, including a method for
determining the nucleotide base sequence of a DNA molecule, comprising annealing said DNA molecule with a
primer molecule able to hybridize to said DNA molecule; incubating separate portions of the annealed mixture in
at least four vessels with four different deoxynucleoside triphosphates, a processive DNA polymerase, wherein
said polymerase remains bound to said DNA molecule for at least 500 bases before dissociating in an
environmental condition normally used in the extension reaction of a DNA sequencing reaction, said polymerase
having less than 500 units of exonuclease activity per mg of said polymerase, and one of four DNA synthesis
terminating agents which terminate DNA synthesis at a specific nucleotide base. The agent terminates at a
different specific nucleotide base in each of the four vessels. The DNA products of the incubating reaction are
separated according to their size so that at least part of the nucleotide base sequence of the DNA molecule can
be determined.
T7 DNA POLYMERASE

This invention relates to DNA polymerases suitable for DNA sequencing and in particular to a method of
amplification of a DNA sequence.

DNA sequencing involves the generation of four populations of single stranded DNA fragments having
one defined terminus and one variable terminus. The variable terminus always terminates at a specific given
nucleotide base (either guanine (G), adenine (A), thymine (T), or cytosine (C)). The four different sets of
fragments are each separated on the basis of their length, on a high resolution polyacrylamide gel; each
band on the gel corresponds colinearly to a specific nucleotide in the DNA sequence, thus identifying the
positions in the sequence of the given nucleotide base.
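
As a rough illustration of how the four lanes are read (a sketch only; the function name `read_gel` and the
fragment lengths are hypothetical and are not taken from this specification), each band length can be mapped
back to the base of the lane in which it appears, and the bands then sorted by length:

```python
def read_gel(lanes):
    """Reconstruct a sequence from the fragment lengths seen in each lane.

    `lanes` maps a base letter to the lengths (counted from the primer end)
    of the fragments that terminate at that base. Hypothetical helper.
    """
    position_to_base = {}
    for base, lengths in lanes.items():
        for length in lengths:
            if length in position_to_base:
                # The same band in two lanes is an artifact, not a base call.
                raise ValueError(f"band at length {length} appears in two lanes")
            position_to_base[length] = base
    # Shortest fragment first, i.e. reading away from the primer.
    return "".join(position_to_base[n] for n in sorted(position_to_base))


# Illustrative band positions only.
example_lanes = {"G": [1, 5], "A": [2, 6], "T": [3], "C": [4, 7]}
print(read_gel(example_lanes))
```

A band that shows up at the same length in more than one lane cannot be assigned to a base, which is why the
sketch rejects it rather than guessing.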

Generally there are two methods of DNA sequencing. One method (Maxam and Gilbert sequencing)
involves the chemical degradation of isolated DNA fragments, each labelled with a single radiolabel at its
defined terminus, each reaction yielding a limited cleavage specifically at one or more of the four bases
(G, A, T or C). The other method (dideoxy sequencing) involves the enzymatic synthesis of a DNA strand.
Four separate syntheses are run, each reaction being caused to terminate at a specific base (G, A, T or C)
via incorporation of the appropriate chain terminating dideoxynucleotide. The latter method is preferred
since the DNA fragments are uniformly labelled (instead of end labelled) and thus the larger DNA fragments
contain increasingly more radioactivity. Further, ³⁵S-labelled nucleotides can be used in place of
³²P-labelled nucleotides, resulting in sharper definition; and the reaction products are simple to interpret
since each lane corresponds only to either G, A, T or C. The enzyme used for most dideoxy sequencing is the
Escherichia coli DNA-polymerase I large fragment ("Klenow"). Another polymerase used is AMV reverse
transcriptase.



Summary of the invention 


In one aspect the invention features a method for determining the nucleotide base sequence of a DNA
molecule, comprising annealing the DNA molecule with a primer molecule able to hybridize to the DNA
molecule; incubating separate portions of the annealed mixture in at least four vessels with four different
deoxynucleoside triphosphates, a processive DNA polymerase wherein the polymerase remains bound to a
DNA molecule for at least 500 bases before dissociating in an environmental condition normally used in the
extension reaction of a DNA sequencing reaction, the polymerase having less than 500 units of exonuclease
activity per mg of polymerase, and one of four DNA synthesis terminating agents which terminate DNA
synthesis at a specific nucleotide base. The agent terminates at a different specific nucleotide base in each
of the four vessels. The DNA products of the incubating reaction are separated according to their size so
that at least a part of the nucleotide base sequence of the DNA molecule can be determined.

In preferred embodiments the polymerase remains bound to the DNA molecule for at least 1000 bases
before dissociating; the polymerase is substantially the same as one in cells infected with a T7-type phage
(i.e., phage in which the DNA polymerase requires host thioredoxin as a subunit; for example, the T7-type
phage is T7, T3, ΦI, ΦII, H, W31, gh-1, Y, A1122, or SP6, Studier, 95 Virology 70, 1979); the polymerase is
non-discriminating for dideoxy nucleotide analogs; the polymerase is modified to have less than 50 units of
exonuclease activity per mg of polymerase, more preferably less than 1 unit, even more preferably less
than 0.1 unit, and most preferably has no detectable exonuclease activity; the polymerase is able to utilize
primers of as short as 10 bases or preferably as short as 4 bases; the primer comprises four to forty
nucleotide bases, and is single stranded DNA or RNA; the annealing step comprises heating the DNA
molecule and the primer to above 65°C, preferably from 65°C to 100°C, and allowing the heated mixture
to cool to below 65°C, preferably to 0°C to 30°C; the incubating step comprises a pulse and a chase step,
wherein the pulse step comprises mixing the annealed mixture with all four different deoxynucleoside
triphosphates and a processive DNA polymerase, wherein at least one of the deoxynucleoside triphosphates
is labelled; most preferably the pulse step is performed under conditions in which the polymerase does not
exhibit its processivity and is for 30 seconds to 20 minutes at 0°C to 20°C or where at least one of the
nucleotide triphosphates is limiting; and the chase step comprises adding one of the chain terminating
agents to four separate aliquots of the mixture after the pulse step; preferably the chase step is for 1 to 60
minutes at 30°C to 50°C; the terminating agent is a dideoxynucleotide, or a limiting level of one
deoxynucleoside triphosphate; one of the four deoxynucleotides is dITP or deazaguanosine; labelled
primers are used so that no pulse step is required, preferably the label is radioactive or fluorescent; and the
polymerase is unable to exhibit its processivity in a second environmental condition normally used in the
pulse reaction of a DNA sequencing reaction.

In other aspects the invention features a) a method for producing blunt ended double-stranded DNA
molecules from a linear DNA molecule having no 3' protruding termini, using a processive DNA polymerase
free from exonuclease activity; b) a method of amplification of a DNA sequence comprising annealing a first
and second primer to opposite strands of a double stranded DNA sequence and incubating the annealed
mixture with a processive DNA polymerase having less than 500 units of exonuclease activity per mg of
polymerase, preferably less than 1 unit, wherein the first and second primers anneal to opposite strands of
the DNA sequence; in preferred embodiments the primers have their 3' ends directed toward each other;
and the method further comprises, after the incubation step, denaturing the resulting DNA, annealing the
first and second primers to the resulting DNA and incubating the annealed mixture with the polymerase;
preferably the cycle of denaturing, annealing and incubating is repeated from 10 to 40 times; c) a method
for in vitro mutagenesis of cloned DNA fragments, comprising providing a cloned fragment and synthesizing
a DNA strand using a processive DNA polymerase having less than 1 unit of exonuclease activity per mg of
polymerase; d) a method of producing active T7-type DNA polymerase from cloned DNA fragments under
the control of non-leaky promoters (see below) in the same cell comprising inducing expression of the
genes only when the cells are in logarithmic growth phase, or stationary phase, and isolating the
polymerase from the cell; preferably the cloned fragments are under the control of a promoter requiring T7
RNA polymerase for expression; e) a gene encoding a T7-type DNA polymerase, the gene being
genetically modified to reduce the activity of naturally occurring exonuclease activity; most preferably a
histidine (His) residue is modified, even more preferably His-123 of gene 5; f) the product of the gene
encoding genetically modified polymerase; g) a method of purifying T7 DNA polymerase from cells
comprising a vector from which the polymerase is expressed, comprising the steps of lysing the cells, and
passing the polymerase over an ion-exchange column, over a DE52 DEAE column, a phosphocellulose
column, and a hydroxyapatite column; preferably prior to the passing step the method comprises
precipitating the polymerase with ammonium sulfate; the method further comprises the step of passing the
polymerase over a Sephadex DEAE A50 column; and the ion-exchange column is a DE52 DEAE column; h)
a method of inactivating exonuclease activity in a DNA polymerase solution comprising incubating the
solution in a vessel containing oxygen, a reducing agent and a transition metal; i) a kit for DNA sequencing,
comprising a processive DNA polymerase, defined as above, having less than 500 units of exonuclease
activity per mg of polymerase, wherein the polymerase is able to exhibit its processivity in a first
environmental condition, and preferably unable to exhibit its processivity in a second environmental
condition, and a reagent necessary for the sequencing, selected from a chain terminating agent, and dITP;
j) a method for labelling the 3' end of a DNA fragment comprising incubating the DNA fragment with a
processive DNA polymerase having less than 500 units of exonuclease activity per mg of polymerase, and
a labelled deoxynucleotide; k) a method for in vitro mutagenesis of a cloned DNA fragment comprising
providing a primer and a template, the primer and the template having a specific mismatched base, and
extending the primer with a processive DNA polymerase; and l) a method for in vitro mutagenesis of a
cloned DNA fragment comprising providing the cloned fragment and synthesizing a DNA strand using a
processive DNA polymerase, having less than 50 units of exonuclease activity, under conditions which
cause misincorporation of a nucleotide base.
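
To give a feel for the 10 to 40 cycles mentioned in aspect b), the following is a minimal sketch of the copy
number after repeated denaturing, annealing and incubating. It assumes, purely for illustration, that every
cycle multiplies the pool by a fixed factor; the `efficiency` parameter and the function name are hypothetical
and are not part of the specification.

```python
def copies_after_cycles(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after `cycles` rounds of amplification.

    Assumption: each cycle multiplies the pool by (1 + efficiency), so an
    efficiency of 1.0 means perfect doubling. Real reactions fall short.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


for n in (10, 40):
    print(n, "cycles:", copies_after_cycles(1, n), "copies per starting molecule")
```

Under perfect doubling the lower end of the range already gives about a thousand-fold increase, which is why a
modest number of cycles is enough in practice.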

This invention provides a DNA polymerase which is processive, non-discriminating, and can utilize short
primers. Further, the polymerase has no associated exonuclease activity. These are ideal properties for the
above described methods, and in particular for DNA sequencing reactions, since the background level of
radioactivity in the polyacrylamide gels is negligible, there are few or no artifactual bands, and the bands are
sharp, making the DNA sequence easy to read. Further, such a polymerase allows novel methods of
sequencing long DNA fragments, as is described in detail below.

Other features and advantages of the invention will be apparent from the following description of the 
preferred embodiments thereof and from the claims. 


Description of the Preferred Embodiments 



The drawings will first briefly be described.



Drawings 






EP 0 386 857 A2 



Figs. 1-3 are diagrammatic representations of the vectors pTrx-2, mGP1-1, and pGP5-5 respectively;

Fig. 4 is a graphical representation of the selective oxidation of T7 DNA polymerase;

Fig. 5 is a graphical representation of the ability of modified T7 polymerase to synthesize DNA in the
presence of etheno-dATP; and

Fig. 6 is a diagrammatic representation of the enzymatic amplification of genomic DNA using
modified T7 DNA polymerase.

Figs. 7, 8 and 9 are the nucleotide sequences of pTrx-2, a part of pGP5-5 and mGP1-2 respectively.

Fig. 10 is a diagrammatic representation of pGP5-6.

DNA Polymerase

In general the DNA polymerase of this invention is processive, has no associated exonuclease activity, 
does not discriminate against nucleotide analog incorporation, and can utilize small oligonucleotides (such 
as tetramers, hexamers and octamers) as specific primers. These properties will now be discussed in detail. 



Processivity 

By processivity is meant that the DNA polymerase is able to continuously incorporate many nucleotides
using the same primer-template without dissociating from the template, under conditions normally used for
DNA sequencing extension reactions. The degree of processivity varies with different polymerases: some
incorporate only a few bases before dissociating (e.g., Klenow (about 15 bases), T4 DNA polymerase (about
10 bases), T5 DNA polymerase (about 180 bases) and reverse transcriptase (about 200 bases) (Das et al., J.
Biol. Chem. 254:1227, 1979; Bambara et al., J. Biol. Chem. 253:413, 1978)) while others, such as those of the
present invention, will remain bound for at least 500 bases and preferably at least 1,000 bases under
suitable environmental conditions. Such environmental conditions include having adequate supplies of all
four deoxynucleoside triphosphates and an incubation temperature from 10°C-50°C. Processivity is greatly
enhanced in the presence of E. coli single stranded binding (ssb) protein.
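
The comparison above can be restated as a small lookup, offered only as a sketch. The approximate figures are
the ones quoted in the paragraph; the dictionary, the function name and the idea of testing them against the
500-base and 1,000-base limits are illustrative choices made here.

```python
# Approximate bases incorporated per binding event, as quoted in the text.
BASES_BEFORE_DISSOCIATION = {
    "Klenow": 15,
    "T4 DNA polymerase": 10,
    "T5 DNA polymerase": 180,
    "reverse transcriptase": 200,
}

MIN_BASES = 500         # minimum stated for polymerases of the invention
PREFERRED_BASES = 1000  # preferred minimum


def is_processive(bases_per_binding, threshold=MIN_BASES):
    """True when an enzyme stays bound for at least `threshold` bases."""
    return bases_per_binding >= threshold


for name, bases in BASES_BEFORE_DISSOCIATION.items():
    print(f"{name}: about {bases} bases, meets 500-base limit: {is_processive(bases)}")
```

None of the four comparison enzymes reaches the 500-base limit, which is the point the paragraph makes in prose.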

With processive enzymes, termination of a sequencing reaction will occur only at those bases which
have incorporated a chain terminating agent, such as a dideoxynucleotide. If the DNA polymerase is
non-processive, then artifactual bands will arise during sequencing reactions, at positions corresponding to the
nucleotide where the polymerase dissociated. Frequent dissociation creates a background of bands at
incorrect positions and obscures the true DNA sequence. This problem is partially corrected by incubating
the reaction mixture for a long time (30-60 min) with a high concentration of substrates, which "chase" the
artifactual bands up to a high molecular weight at the top of the gel, away from the region where the DNA
sequence is read. This is not an ideal solution since a non-processive DNA polymerase has a high
probability of dissociating from the template at regions of compact secondary structure, or hairpins.
Reinitiation of primer elongation at these sites is inefficient and the usual result is the formation of bands at
the same position for all four nucleotides, thus obscuring the DNA sequence.


Analog discrimination

The DNA polymerases of this invention do not discriminate significantly between dideoxy-nucleotide
analogs and normal nucleotides. That is, the chance of incorporation of an analog is approximately the
same as that of a normal nucleotide, or at least the polymerase incorporates the analog with at least 1/10
the efficiency of a normal nucleotide. The polymerases of this invention also do not discriminate significantly
against some other analogs. This is important since, in addition to the four normal deoxynucleoside
triphosphates (dGTP, dATP, dTTP and dCTP), sequencing reactions require the incorporation of other types of
nucleotide derivatives such as radioactively- or fluorescently-labelled nucleoside triphosphates, usually for
labeling the synthesized strands with ³⁵S, ³²P, or other chemical agents. When a DNA polymerase does not
discriminate against analogs the same probability will exist for the incorporation of an analog as for a normal
nucleotide. For labelled nucleoside triphosphates this is important in order to efficiently label the synthesized
DNA strands using a minimum of radioactivity. Further, lower levels of analogs are required with such enzymes,
making the sequencing reaction cheaper than with a discriminating enzyme.

Discriminating polymerases show a different extent of discrimination when they are polymerizing in a 
processive mode versus when stalled, struggling to synthesize through a secondary structure impediment. 
At such impediments there will be a variability in the intensity of different radioactive bands on the gel,
which may obscure the sequence.



Exonuclease Activity 

The DNA polymerase of the invention has less than 50%, preferably less than 1%, and most preferably 
less than 0.1%, of the normal or naturally associated level of exonuclease activity (amount of activity per 
polymerase molecule). By normal or naturally associated level is meant the exonuclease activity of 
unmodified T7-type polymerase. Normally the associated activity is about 5,000 units of exonuclease 
activity per mg of polymerase, measured as described below by a modification of the procedure of Chase 
et al. (249 J. Biol. Chem. 4545, 1974). Exonucleases increase the fidelity of DNA synthesis by excising any
newly synthesized bases which are incorrectly basepaired to the template. Such associated exonuclease 
activities are detrimental to the quality of DNA sequencing reactions. They raise the minimal required 
concentration of nucleotide precursors which must be added to the reaction since, when the nucleotide 
concentration falls, the polymerase activity slows to a rate comparable with the exonuclease activity,
resulting in no net DNA synthesis, or even degradation of the synthesized DNA. 
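
The percentage limits above can be related to the quoted native level of about 5,000 units per mg with a short
calculation. This is a sketch only: the function names and tier labels are hypothetical, and the native level is
the approximate figure given in the text rather than a measured constant.

```python
NATIVE_UNITS_PER_MG = 5000.0  # approximate native level quoted in the text


def percent_of_native(units_per_mg, native=NATIVE_UNITS_PER_MG):
    """Exonuclease activity expressed as a percentage of the native level."""
    return 100.0 * units_per_mg / native


def exonuclease_tier(units_per_mg):
    """Classify an activity against the <50%, <1% and <0.1% limits."""
    pct = percent_of_native(units_per_mg)
    if pct < 0.1:
        return "most preferred (<0.1%)"
    if pct < 1.0:
        return "preferred (<1%)"
    if pct < 50.0:
        return "within limit (<50%)"
    return "outside limit"


for units in (5000, 500, 49, 4):
    print(units, "units/mg:", percent_of_native(units), "% ->", exonuclease_tier(units))
```

On this arithmetic, 500 units per mg corresponds to 10% of the native level, 50 units to 1%, and 5 units
to 0.1%.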

More importantly, associated exonuclease activity will cause a DNA polymerase to idle at regions in the 
template with secondary structure impediments. When a polymerase approaches such a structure its rate of
synthesis decreases as it struggles to pass. An associated exonuclease will excise the newly synthesized 
DNA when the polymerase stalls. As a consequence numerous cycles of synthesis and excision will occur. 
This may result in the polymerase eventually synthesizing past the hairpin (with no detriment to the quality 
of the sequencing reaction); or the polymerase may dissociate from the synthesized strand (resulting in an 
artifactual band at the same position in all four sequencing reactions); or a chain terminating agent may be
incorporated at a high frequency and produce a wide variability in the intensity of different fragments in a 
sequencing gel. This happens because the frequency of incorporation of a chain terminating agent at any 
given site increases with the number of opportunities the polymerase has to incorporate the chain 
terminating nucleotide, and so the DNA polymerase will incorporate a chain-terminating agent at a much 
higher frequency at sites of idling than at other sites. 
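
The effect of idling on band intensity can be sketched numerically. The model below is an assumption made for
illustration, not a statement from the specification: each visit to a site is treated as an independent trial
with the same small probability `p_per_attempt` of incorporating a chain terminator, so that after n visits the
chance of termination is 1 - (1 - p)^n. The value 0.01 is arbitrary.

```python
def termination_probability(p_per_attempt, attempts):
    """Chance that a terminator is incorporated within `attempts` visits.

    Assumes independent attempts with equal probability (illustrative model).
    """
    if not 0.0 <= p_per_attempt <= 1.0:
        raise ValueError("p_per_attempt must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    return 1.0 - (1.0 - p_per_attempt) ** attempts


p = 0.01  # arbitrary per-visit probability, for illustration only
for visits in (1, 10, 100):
    print(visits, "visits:", round(termination_probability(p, visits), 4))
```

With a single visit per site, as expected for an enzyme lacking exonuclease, every position has the same chance
of termination; repeated synthesis and excision at one site raises that chance at that site only.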

An ideal sequencing reaction will produce bands of uniform intensity throughout the gel. This is 
essential for obtaining the optimal exposure of the X-ray film for every radioactive fragment. If there is 
variable intensity of radioactive bands, then fainter bands have a chance of going undetected. To obtain 
uniform radioactive intensity of all fragments, the DNA polymerase should spend the same interval of time 
at each position on the DNA, showing no preference for either the addition or removal of nucleotides at any
given site. This occurs if the DNA polymerase lacks any associated exonuclease, so that it will have only 
one opportunity to incorporate a chain terminating nucleotide at each position along the template.



Short primers 

The DNA polymerase of the invention is able to utilize primers of 10 bases or less, as well as longer 
ones, most preferably of 4-20 bases. The ability to utilize short primers offers a number of important 
advantages to DNA sequencing. The shorter primers are cheaper to buy and easier to synthesize than the 
usual 15-20-mer primers. They also anneal faster to complementary sites on a DNA template, thus making 
the sequencing reaction faster. Further, the ability to utilize small (e.g., six or seven base) oligonucleotide
primers for DNA sequencing permits strategies not otherwise possible for sequencing long DNA fragments. 
For example, a kit containing 80 random hexamers could be generated, none of which are complementary 
to any sites in the cloning vector. Statistically, one of the 80 hexamer sequences will occur an average of 
every 50 bases along the DNA fragment to be sequenced. The determination of a sequence of 3000 bases 
would require only five sequencing cycles. First, a "universal" primer (e.g., New England Biolabs #1211,
sequence 5' GTAAAACGACGGCCAGT 3') would be used to sequence about 600 bases at one end of the
insert. Using the results from this sequencing reaction, a new primer would be picked from the kit 
homologous to a region near the end of the determined sequence. In the second cycle, the sequence of the 
next 600 bases would be determined using this primer. Repetition of this process five times would 
determine the complete sequence of the 3000 bases, without necessitating any subcloning, and without the
chemical synthesis of any new oligonucleotide primers. The use of such short primers may be enhanced by 
including gene 2.5 and 4 protein of T7 in the sequencing reaction. 
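
The figures in the hexamer example can be checked with a short calculation. This is a sketch under simplifying
assumptions that are made here and not stated in the text: the template is treated as a random sequence with
equal base frequencies, and only one strand is considered. The function names are hypothetical.

```python
import math


def mean_spacing(primer_length, kit_size):
    """Average distance in bases between sites matching any primer in the kit.

    Assumes a random template with equal base frequencies (illustrative).
    """
    possible_primers = 4 ** primer_length  # 4096 distinct hexamers
    return possible_primers / kit_size


def cycles_needed(total_bases, bases_per_cycle):
    """Number of sequencing cycles needed to walk along the insert."""
    return math.ceil(total_bases / bases_per_cycle)


print("mean spacing for 80 hexamers:", mean_spacing(6, 80), "bases")
print("cycles for 3000 bases at 600 per cycle:", cycles_needed(3000, 600))
```

The spacing works out to roughly 51 bases, in line with the "every 50 bases" stated above, and 3000 bases at
about 600 bases per cycle gives the five cycles described.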

DNA polymerases of this invention (i.e., having the above properties) include modified T7-type
polymerases. That is, the DNA polymerase requires host thioredoxin as a sub-unit, and they are
substantially identical to a modified T7 DNA polymerase or to equivalent enzymes isolated from related phage,
such as T3, ΦI, ΦII, H, W31, gh-1, Y, A1122 and SP6. Each of these enzymes can be modified to have
properties similar to those of the modified T7 enzyme. It is possible to isolate the enzyme from phage
infected cells directly, but preferably the enzyme is isolated from cells which overproduce it. By
substantially identical is meant that the enzyme may have amino acid substitutions which do not affect the overall
properties of the enzyme. One example of a particularly desirable amino acid substitution is one in which
the natural enzyme is modified to remove any exonuclease activity. This modification may be performed at
the genetic or chemical level (see below).


Cloning T7 polymerase 

As an example of the invention we shall describe the cloning, overproduction, purification, modification
and use of T7 DNA polymerase. This processive enzyme consists of two polypeptides tightly complexed in
a one to one stoichiometry. One is the phage T7-encoded gene 5 protein of 84,000 daltons (Modrich et al.,
250 J. Biol. Chem. 5515, 1975), the other is the E. coli encoded thioredoxin, of 12,000 daltons (Tabor et al.,
J. Biol. Chem. 262:16,216, 1987). The thioredoxin is an accessory protein and attaches the gene 5 protein
(the non-processive actual DNA polymerase) to the primer template. The natural DNA polymerase has a
very active 3' to 5' exonuclease associated with it. This activity makes the polymerase useless for DNA
sequencing and must be inactivated or modified before the polymerase can be used. This is readily
performed, as described below, either chemically, by local oxidation of the exonuclease domain, or
genetically, by modifying the coding region of the polymerase gene encoding this activity.



pTrx-2

In order to clone the trxA (thioredoxin) gene of E. coli, wild type E. coli DNA was partially cleaved with
Sau3A and the fragments ligated to BamHI-cleaved T7 DNA isolated from strain T7 ST9 (Tabor et al., in
Thioredoxin and Glutaredoxin Systems: Structure and Function (Holmgren et al., eds) pp. 285-300, Raven
Press, NY; and Tabor et al., supra). The ligated DNA was transfected into E. coli trxA⁻ cells, the mixture
plated onto trxA⁻ cells, and the resulting T7 plaques picked. Since T7 cannot grow without an active E. coli
trxA gene only those phages containing the trxA gene could form plaques. The cloned trxA genes were
located on a 470 base pair HincII fragment.

In order to overproduce thioredoxin a plasmid, pTrx-2, was constructed. Briefly, the 470 base pair
HincII fragment containing the trxA gene was isolated by standard procedure (Maniatis et al., Cloning: A
Laboratory Manual, Cold Spring Harbor Labs., Cold Spring Harbor, N.Y.), and ligated to a derivative of
pBR322 containing a Ptac promoter (ptac-12, Amann et al., 25 Gene 167, 1983). Referring to Fig. 2,
ptac-12, containing β-lactamase and ColE1 origin, was cut with PvuII, to yield a fragment of 2290 bp, which was
then ligated to two tandem copies of trxA (HincII fragment) using commercially available linkers
(SmaI-BamHI polylinker), to form pTrx-2. The complete nucleotide sequence of pTrx-2 is shown in Figure 7.
Thioredoxin production is now under the control of the tac promoter, and thus can be specifically induced,
e.g. by IPTG (isopropyl β-D-thiogalactoside).

pGP5-5 and mGP1-2

Some gene products of T7 are lethal when expressed in E. coli. An expression system was developed
to facilitate cloning and expression of lethal genes, based on the inducible expression of T7 RNA
polymerase. Gene 5 protein is lethal in some E. coli strains and an example of such a system is described
by Tabor et al., 82 Proc. Nat. Acad. Sci. 1074 (1985), where T7 gene 5 was placed under the control of the
Φ10 promoter, and is only expressed when T7 RNA polymerase is present in the cell.

Briefly, pGP5-5 (Fig. 3) was constructed by standard procedures using synthetic BamHI linkers to join
the T7 fragment from 14306 (NdeI) to 16869 (AhaIII), containing gene 5, to the 560 bp fragment of T7 from
5667 (HincII) to 6166 (Fnu4HI) containing both the Φ1.1A and Φ1.1B promoters, which are recognized by
T7 RNA polymerase, and the 3 kb BamHI-HincII fragment of pACYC177 (Chang et al., 134 J. Bacteriol. 1141,
1978). The nucleotide sequence of the T7 inserts and linkers is shown in Fig. 8. In this plasmid gene 5 is
only expressed when T7 RNA polymerase is provided in the cell.

Referring to Fig. 3, T7 RNA polymerase is provided on phage vector mGP1-2. This is similar to pGP1-2
(Tabor et al., id.) except that the fragment of T7 from 3133 (HaeIII) to 5840 (HinfI), containing T7 RNA
polymerase, was ligated, using linkers (BglII and SalI respectively), to BamHI-SalI cut M13 mp8, placing the
polymerase gene under control of the lac promoter. The complete nucleotide sequence of mGP1-2 is
shown in Fig. 9.

Since pGP5-5 and pTrx-2 have different origins of replication (respectively a P15A and a ColE1 origin)
they can be transformed into one cell simultaneously. pTrx-2 expresses large quantities of thioredoxin in the
presence of IPTG. mGP1-2 can coexist in the same cell as these two plasmids and be used to regulate
expression of T7 DNA polymerase from pGP5-5, simply by causing production of T7 RNA polymerase by
inducing the lac promoter with, e.g., IPTG.
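
The compatibility argument can be written down as a simple check. This is a deliberately simplified sketch: it
treats two plasmids as compatible whenever their replication origins differ, which is the reasoning given
above, and ignores every other factor. The `Plasmid` class and `compatible` function are hypothetical names; the
origins and resistance markers are those stated in this specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plasmid:
    name: str
    origin: str  # origin of replication
    marker: str  # antibiotic resistance used for selection


def compatible(a, b):
    """Simplified rule: plasmids with different origins can share a cell."""
    return a.origin != b.origin


pgp5_5 = Plasmid("pGP5-5", origin="P15A", marker="kanamycin")
ptrx_2 = Plasmid("pTrx-2", origin="ColE1", marker="ampicillin")

print(pgp5_5.name, "and", ptrx_2.name, "compatible:", compatible(pgp5_5, ptrx_2))
```

Because the two plasmids also carry different resistance markers, cells holding both can be selected with the
two antibiotics together.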


Overproduction of T7 DNA polymerase 

There are several potential strategies for overproducing and reconstituting the two gene products of
trxA and gene 5. The same cell strains and plasmids can be utilized for all the strategies. In the preferred
strategy the two genes are co-overexpressed in the same cell. (This is because gene 5 is susceptible to
proteases until thioredoxin is bound to it.) As described in detail below, one procedure is to place the two
genes separately on each of two compatible plasmids in the same cell. Alternatively, the two genes could
be placed in tandem on the same plasmid. It is important that the T7 gene 5 is placed under the control of
a non-leaky inducible promoter, such as Φ1.1A, Φ1.1B and Φ10 of T7, as the synthesis of even small
quantities of the two polypeptides together is toxic in most E. coli cells. By non-leaky is meant that less
than 500 molecules of the gene product are produced, per cell generation time, from the gene when the
promoter controlling the gene's expression is not activated. Preferably the T7 RNA polymerase expression
system is used although other expression systems which utilize inducible promoters could also be used. A
leaky promoter, e.g., plac, allows more than 500 molecules of protein to be synthesized, even when not
induced, thus cells containing lethal genes under the control of such a promoter grow poorly and are not
suitable in this invention. It is of course possible to produce these products in cells where they are not
lethal; for example, the plac promoter is suitable in such cells.

In a second strategy each gene can be cloned and overexpressed separately. Using this strategy, the
cells containing the individually overproduced polypeptides are combined prior to preparing the extracts, at
which point the two polypeptides form an active T7 DNA polymerase.



Example 1: Production of T7 DNA polymerase

E. coli strain 71.18 (Messing et al., Proc. Nat. Acad. Sci. 74:3642, 1977) is used for preparing stocks of
mGP1-2. 71.18 is stored in 50% glycerol at -80°C and is streaked on a standard minimal media agar plate.
A single colony is grown overnight in 25 ml standard M9 media at 37°C, and a single plaque of mGP1-2 is
obtained by titering the stock using freshly prepared 71.18 cells. The plaque is used to inoculate 10 ml 2X
LB (2% Bacto-Tryptone, 1% yeast extract, 0.5% NaCl, 8 mM NaOH) containing JM103 grown to an
A590 = 0.5. This culture will provide the phage stock for preparing a large culture of mGP1-2. After 3-12
hours, the 10 ml culture is centrifuged, and the supernatant used to infect the large (2 L) culture. For the
large culture, 4 X 500 ml 2X LB is inoculated with 4 X 5 ml 71.18 cells grown in M9, and is shaken at 37°C.
When the large culture of cells has grown to an A590 = 1.0 (approximately three hours), they are inoculated
with 10 ml of supernatant containing the starter lysate of mGP1-2. The infected cells are then grown
overnight at 37°C. The next day, the cells are removed by centrifugation, and the supernatant is ready to
use for induction of K38/pGP5-5/pTrx-2 (see below). The supernatant can be stored at 4°C for
approximately six months, at a titer of ~5 × 10^11 Φ/ml. At this titer, 1 L of phage will infect 12 liters of cells at an
A590 = 5 with a multiplicity of infection of 15. If the titer is low, the mGP1-2 phage can be concentrated from
the supernatant by dissolving NaCl (60 gm/liter) and PEG-6000 (65 gm/liter) in the supernatant, allowing the
mixture to settle at 0°C for 1-72 hours, and then centrifuging (7000 rpm for 20 min). The precipitate, which
contains the mGP1-2 phage, is resuspended in approximately 1/20th of the original volume of M9 media.
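
The relationship between titer, culture volume and multiplicity of infection can be sketched as follows. The
titer is treated here as an input parameter and taken as about 5 × 10^11 phage per ml, which is an assumed
reading; the cell density is not stated in the text, so the sketch back-calculates the density implied by the
stated volumes rather than asserting one. Function names are hypothetical.

```python
def phage_volume_ml(culture_ml, cells_per_ml, moi, titer_per_ml):
    """Volume of phage stock needed to reach a given multiplicity of infection."""
    return culture_ml * cells_per_ml * moi / titer_per_ml


def implied_cells_per_ml(phage_ml, titer_per_ml, culture_ml, moi):
    """Cell density implied by a stated phage volume, titer and MOI."""
    return phage_ml * titer_per_ml / (culture_ml * moi)


ASSUMED_TITER = 5e11  # phage per ml; assumed reading of the titer

# 1 L of phage, 12 L of cells, MOI of 15, as stated in the text.
density = implied_cells_per_ml(1000, ASSUMED_TITER, 12000, 15)
print(f"implied cell density: {density:.2e} cells/ml")

# Round trip: the same density should call for 1000 ml of phage stock.
print("phage needed (ml):", phage_volume_ml(12000, density, 15, ASSUMED_TITER))
```

If the stock titer turns out lower than assumed, the same formula shows how much more supernatant, or how much
concentration by PEG precipitation, would be needed for the same multiplicity of infection.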

K38/pGP5-5/pTrx-2 is the E. coli strain (genotype HfrC (λ)) containing the two compatible plasmids
pGP5-5 and pTrx-2. pGP5-5 plasmid has a P15A origin of replication and expresses the kanamycin (Km)
resistance gene. pTrx-2 has a ColE1 origin of replication and expresses the ampicillin (Ap) resistance gene.
The plasmids are introduced into K38 by standard procedures, selecting Km^R and Ap^R respectively. The
cells K38/pGP5-5/pTrx-2 are stored in 50% glycerol at -80°C. Prior to use they are streaked on a plate
containing 50 µg/ml ampicillin and kanamycin, grown at 37°C overnight and a single colony grown in 10 ml
LB media containing 50 µg/ml ampicillin and kanamycin, at 37°C for 4-6 hours. The 10 ml cell culture is
used to inoculate 500 ml of LB media containing 50 µg/ml ampicillin and kanamycin and shaken at 37°C
overnight. The following day, the 500 ml culture is used to inoculate 11 liters of 2X LB-KPO4 media (2%
Bacto-Tryptone, 1% yeast extract, 0.5% NaCl, 20 mM KPO4, 0.2% dextrose, and 0.2% casamino acids, pH
7.4), and grown with aeration in a fermentor at 37°C. When the cells reach an A590 = 5.0 (i.e. logarithmic or
stationary phase cells), they are infected with mGP1-2 at a multiplicity of infection of 10, and IPTG is added
(final concentration 0.5 mM). The IPTG induces production of thioredoxin and the T7 RNA polymerase in
mGP1-2, and thence induces production of the cloned DNA polymerase. The cells are grown for an
additional 2.5 hours with stirring and aeration, and then harvested. The cell pellet is resuspended in 1.5 L
10% sucrose/20 mM Tris-HCl, pH 8.0/25 mM EDTA and re-spun. Finally, the cell pellet is resuspended in
200 ml 10% sucrose/20 mM Tris-HCl, pH 8/1.0 mM EDTA, and frozen in liquid N2. From 12 liters of
induced cells 70 gm of cell paste are obtained containing approximately 700 mg gene 5 protein and 100
mg thioredoxin.

K38/pTrx-2 (K38 containing pTrx-2 alone) overproduces thioredoxin, and it is added as a "booster" to
extracts of K38/pGP5-5/pTrx-2 to ensure that thioredoxin is in excess over gene 5 protein at the outset of the
purification. The K38/pTrx-2 cells are stored in 50% glycerol at -80°C. Prior to use they are streaked on a
plate containing 50 µg/ml ampicillin, grown at 37°C for 24 hours, and a single colony is grown at 37°C
overnight in 25 ml LB media containing 50 µg/ml ampicillin. The 25 ml culture is used to inoculate 2 L of 2X
LB media and shaken at 37°C. When the cells reach an A590 = 3.0, the ptac promoter, and thus thioredoxin
production, is induced by the addition of IPTG (final concentration 0.5 mM). The cells are grown with
shaking for an additional 12-16 hours at 37°C, harvested, resuspended in 600 ml 10% sucrose/20 mM
Tris-HCl, pH 8.0/25 mM EDTA, and re-spun. Finally, the cells are resuspended in 40 ml 10% sucrose/20 mM
Tris-HCl, pH 8/0.5 mM EDTA, and frozen in liquid N2. From 2 L of cells, 16 gm of cell paste are obtained,
containing 150 mg of thioredoxin.

Assays for the polymerase involve the use of single-stranded calf thymus DNA (6 mM) as a substrate.
This is prepared immediately prior to use by denaturation of double-stranded calf thymus DNA with 50 mM
NaOH at 20°C for 15 min, followed by neutralization with HCl. Any purified DNA can be used as a
template for the polymerase assay, although preferably it will have a length greater than 1,000 bases.

The standard T7 DNA polymerase assay used is a modification of the procedure described by Grippo
et al. (246 J. Biol. Chem. 6867, 1971). The standard reaction mix (200 µl final volume) contains 40 mM
Tris/HCl pH 7.5, 10 mM MgCl2, 5 mM dithiothreitol, 100 nmol alkali-denatured calf thymus DNA, 0.3 mM
dGTP, dATP, dCTP and [3H]dTTP (20 cpm/pmol), 50 µg/ml BSA, and varying amounts of T7 DNA
polymerase. Incubation is at 37°C (10°C-45°C) for 30 min (5 min-60 min). The reaction is stopped by the
addition of 3 ml of cold (0°C) 1 N HCl-0.1 M pyrophosphate. Acid-insoluble radioactivity is determined by
the procedure of Hinkle et al. (250 J. Biol. Chem. 5523, 1974). The DNA is precipitated on ice for 15 min (5
min-12 hr), then precipitated onto glass-fiber filters by filtration. The filters are washed five times with 4 ml
of cold (0°C) 0.1 M HCl-0.1 M pyrophosphate, and twice with cold (0°C) 90% ethanol. After drying, the
radioactivity on the filters is counted using a non-aqueous scintillation fluor.

One unit of polymerase activity catalyzes the incorporation of 10 nmol of total nucleotide into an
acid-insoluble form in 30 min at 37°C, under the conditions given above. Native T7 DNA polymerase and modified
T7 DNA polymerase (see below) have the same specific polymerase activity ± 20% (which ranges between
5,000-20,000 units/mg for native and 5,000-50,000 units/mg for modified polymerase), depending upon the
preparation, using the standard assay conditions stated above.

T7 DNA polymerase is purified from the above extracts by precipitation and chromatography
techniques. An example of such a purification follows.

Frozen cells (200 ml K38/pGP5-5/pTrx-2 and 40 ml K38/pTrx-2) are thawed at 0°C
overnight. The cells are combined, and 5 ml of lysozyme (15 mg/ml) and 10 ml of NaCl (5 M) are added.
After 45 min at 0°C, the cells are placed in a 37°C water bath until their temperature reaches 20°C. The
cells are then frozen in liquid N2. An additional 50 ml of NaCl (5 M) is added, and the cells are thawed in a
37°C water bath. After thawing, the cells are gently mixed at 0°C for 60 min. The lysate is centrifuged for
one hr at 35,000 rpm in a Beckman 45Ti rotor. The supernatant (250 ml) is fraction I. It contains
approximately 700 mg gene 5 protein and 250 mg of thioredoxin (a 2:1 ratio thioredoxin to gene 5 protein).

90 gm of ammonium sulphate is dissolved in fraction I (250 ml) and stirred for 60 min. The suspension
is allowed to sit for 60 min, and the resulting precipitate collected by centrifugation at 8000 rpm for 60 min.
The precipitate is redissolved in 300 ml of 20 mM Tris-HCl pH 7.5/5 mM 2-mercaptoethanol/0.1 mM
EDTA/10% glycerol (Buffer A). This is fraction II.

A column of Whatman DE52 DEAE (12.6 cm² × 18 cm) is prepared and washed with Buffer A. Fraction
II is dialyzed overnight against two changes of 1 L of Buffer A each, until its conductivity is
equal to that of Buffer A containing 100 mM NaCl. Dialyzed Fraction II is applied to the column
at a flow rate of 100 ml/hr, and washed with 400 ml of Buffer A containing 100 mM NaCl. Proteins are
eluted with a 3.5 L gradient from 100 to 400 mM NaCl in Buffer A at a flow rate of 60 ml/hr. Fractions
containing T7 DNA polymerase, which elutes at 200 mM NaCl, are pooled. This is fraction III (190 ml).
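
For planning fraction collection, the salt concentration along a linear gradient can be estimated from the delivered volume. The following is a minimal sketch, assuming an ideal linear gradient and ignoring the column and tubing dead volume; the function names are illustrative.

```python
def gradient_conc(volume_ml, start_mM, end_mM, total_ml):
    """Salt concentration (mM) after volume_ml of a linear gradient has been delivered."""
    if not 0 <= volume_ml <= total_ml:
        raise ValueError("volume must lie within the gradient")
    return start_mM + (end_mM - start_mM) * volume_ml / total_ml


def volume_at_conc(conc_mM, start_mM, end_mM, total_ml):
    """Delivered volume (ml) at which the gradient reaches conc_mM."""
    return (conc_mM - start_mM) / (end_mM - start_mM) * total_ml


# DE52 step: 3.5 L gradient, 100 to 400 mM NaCl, 60 ml/hr, elution at 200 mM.
elution_ml = volume_at_conc(200.0, 100.0, 400.0, 3500.0)
elution_hr = elution_ml / 60.0

print(f"200 mM NaCl is reached after about {elution_ml:.0f} ml ({elution_hr:.1f} hr)")
```

The same two functions apply to the KCl and KPO4 gradients of the later columns by changing the arguments.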

A column of Whatman P11 phosphocellulose (12.6 cm² × 12 cm) is prepared and washed with 20 mM
KPO4 pH 7.4/5 mM 2-mercaptoethanol/0.1 mM EDTA/10% glycerol (Buffer B). Fraction III is diluted 2-fold (380
ml) with Buffer B, then applied to the column at a flow rate of 60 ml/hr, and washed with 200 ml of Buffer B
containing 100 mM KCl. Proteins are eluted with a 1.8 L gradient from 100 to 400 mM KCl in Buffer B at a
flow rate of 60 ml/hr. Fractions containing T7 DNA polymerase, which elutes at 300 mM KCl, are pooled.
This is fraction IV (370 ml).

A column of DEAE-Sephadex A-50 (4.9 cm² × 15 cm) is prepared and washed with 20 mM Tris-HCl
pH 7.0/0.1 mM dithiothreitol/0.1 mM EDTA/10% glycerol (Buffer C). Fraction IV is dialyzed against two changes
of 1 L Buffer C to a final conductivity equal to that of Buffer C containing 100 mM NaCl. Dialyzed fraction IV
is applied to the column at a flow rate of 40 ml/hr, and washed with 150 ml of Buffer C containing 100 mM
NaCl. Proteins are eluted with a 1 L gradient from 100 to 300 mM NaCl in Buffer C at a flow rate of 40
ml/hr. Fractions containing T7 DNA polymerase, which elutes at 210 mM NaCl, are pooled. This is fraction V
(120 ml).

A column of BioRad HTP hydroxylapatite (4.9 cm² × 15 cm) is prepared and washed with 20 mM KPO4,
pH 7.4/10 mM 2-mercaptoethanol/2 mM Na citrate/10% glycerol (Buffer D). Fraction V is dialyzed against
two changes of 500 ml Buffer D each. Dialyzed fraction V is applied to the column at a flow rate of 30 ml/hr,
and washed with 100 ml of Buffer D. Proteins are eluted with a 900 ml gradient from 0 to 180 mM KPO4,
pH 7.4 in Buffer D at a flow rate of 30 ml/hr. Fractions containing T7 DNA polymerase, which elutes at 50
mM KPO4, are pooled. This is fraction VI (130 ml). It contains 270 mg of homogeneous T7 DNA
polymerase.

Fraction VI is dialyzed versus 20 mM KPO4 pH 7.4/0.1 mM dithiothreitol/0.1 mM EDTA/50% glycerol.
This is concentrated fraction VI (~65 ml, 4 mg/ml), and is stored at -20°C.

The isolated T7 polymerase has exonuclease activity associated with it. As stated above, this must be
inactivated. An example of inactivation by chemical modification follows.

Concentrated fraction VI is dialyzed overnight against 20 mM KPO4 pH 7.4/0.1 mM dithiothreitol/10%
glycerol to remove the EDTA present in the storage buffer. After dialysis, the concentration is adjusted to 2
mg/ml with 20 mM KPO4 pH 7.4/0.1 mM dithiothreitol/10% glycerol, and 30 ml (2 mg/ml) aliquots are placed
in 50 ml polypropylene tubes. (At 2 mg/ml, the molar concentration of T7 DNA polymerase is 22 µM.)

Dithiothreitol (DTT) and ferrous ammonium sulfate (Fe(NH4)2(SO4)2·6H2O) are prepared fresh
immediately before use, and added to a 30 ml aliquot of T7 DNA polymerase, to concentrations of 5 mM DTT
(0.6 ml of a 250 mM stock) and 20 µM Fe(NH4)2(SO4)2·6H2O (0.6 ml of a 1 mM stock). During modification
the molar concentrations of T7 DNA polymerase and iron are each approximately 20 µM, while DTT is in
250X molar excess.
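
The stated concentrations can be checked with the dilution relation C1·V1 = C2·V2. The following is a minimal sketch, assuming both 0.6 ml additions go into the same 30 ml aliquot (final volume 31.2 ml); the variable names are illustrative.

```python
def diluted_conc(stock_conc, stock_vol_ml, final_vol_ml):
    """Final concentration after dilution (same units as stock_conc)."""
    return stock_conc * stock_vol_ml / final_vol_ml


ENZYME_ML = 30.0   # aliquot of T7 DNA polymerase at 2 mg/ml (22 µM)
DTT_ML = 0.6       # 250 mM DTT stock
FE_ML = 0.6        # 1 mM ferrous ammonium sulfate stock
final_ml = ENZYME_ML + DTT_ML + FE_ML

dtt_mM = diluted_conc(250.0, DTT_ML, final_ml)
fe_uM = diluted_conc(1000.0, FE_ML, final_ml)       # 1 mM = 1000 µM
enzyme_uM = diluted_conc(22.0, ENZYME_ML, final_ml)

dtt_excess = dtt_mM * 1000.0 / fe_uM                # DTT relative to iron
implied_mw = 2.0 / 22e-6                            # (g/L) / (mol/L), from 2 mg/ml = 22 µM

print(f"DTT {dtt_mM:.2f} mM, Fe {fe_uM:.1f} µM, enzyme {enzyme_uM:.1f} µM")
print(f"DTT excess over iron: {dtt_excess:.0f}X; implied MW: {implied_mw:.0f} g/mol")
```

The results agree with the rounded figures given above (about 5 mM DTT, about 20 µM iron and enzyme).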

The modification is carried out at 0°C under a saturated oxygen atmosphere as follows. The reaction
mixture is placed on ice within a desiccator, and the desiccator is purged of air by evacuation and subsequently
filled with 100% oxygen. This cycle is repeated three times. The reaction can be performed in air (20%
oxygen), but occurs at one third the rate.

The time course of loss of exonuclease activity is shown in Fig. 4. 3H-labeled double-stranded DNA (6
cpm/pmol) was prepared from bacteriophage T7 as described by Richardson (15 J. Molec. Biol. 49, 1966).
3H-labeled single-stranded T7 DNA was prepared immediately prior to use by denaturation of
double-stranded 3H-labeled T7 DNA with 50 mM NaOH at 20°C for 15 min, followed by neutralization with HCl.
The standard exonuclease assay used is a modification of the procedure described by Chase et al. (supra).
The standard reaction mixture (100 µl final volume) contained 40 mM Tris/HCl pH 7.5, 10 mM MgCl2, 10
mM dithiothreitol, 60 nmol 3H-labeled single-stranded T7 DNA (6 cpm/pmol), and varying amounts of T7 DNA
polymerase. 3H-labeled double-stranded T7 DNA can also be used as a substrate. Also, any uniformly
radioactively labeled DNA, single- or double-stranded, can be used for the assay. Also, 3' end-labeled
single- or double-stranded DNA can be used for the assay. After incubation at 37°C for 15 min, the reaction
is stopped by the addition of 30 µl of BSA (10 mg/ml) and 25 µl of TCA (100% w/v). The assay can be run
at 10°C-45°C for 1-60 min. The DNA is precipitated on ice for 15 min (1 min - 12 hr), then centrifuged at
12,000 g for 30 min (5 min - 3 hr). 100 µl of the supernatant is used to determine the acid-soluble
radioactivity by adding it to 400 µl water and 5 ml of aqueous scintillation cocktail.

One unit of exonuclease activity catalyzes the acid solubilization of 10 nmol of total nucleotide in 30 min
under the conditions of the assay. Native T7 DNA polymerase has a specific exonuclease activity of 5000
units/mg, using the standard assay conditions stated above. The specific exonuclease activity of the
modified T7 DNA polymerase depends upon the extent of chemical modification, but ideally is at least 10- to
100-fold lower than that of native T7 DNA polymerase, or 500 to 50 or fewer units/mg using the standard
assay conditions stated above. When double-stranded substrate is used, the exonuclease activity is about
7-fold higher.
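
Counts from the exonuclease assay can be converted to units as follows. This is a minimal sketch, assuming that release is linear with time (so that a 15 min incubation can be scaled to the 30 min of the unit definition), that counts are background-subtracted, and that 100 µl is counted out of a 155 µl stopped reaction (100 µl reaction + 30 µl BSA + 25 µl TCA); the function names are illustrative.

```python
def exonuclease_units(cpm_counted, cpm_per_pmol=6.0, counted_ul=100.0,
                      total_ul=155.0, incubation_min=15.0):
    """Units of exonuclease in one assay (1 unit = 10 nmol solubilized per 30 min)."""
    cpm_total = cpm_counted * total_ul / counted_ul   # correct for the aliquot counted
    nmol = cpm_total / cpm_per_pmol / 1000.0          # cpm -> pmol -> nmol
    nmol_per_30_min = nmol * 30.0 / incubation_min    # assumed linear with time
    return nmol_per_30_min / 10.0


def specific_activity(units, enzyme_mg):
    """Specific activity in units/mg."""
    return units / enzyme_mg


example_units = exonuclease_units(1000.0)
print(f"1000 cpm counted corresponds to {example_units:.4f} units in the assay")
```

Dividing the result by the mass of enzyme added gives the specific activity that is compared with the 5000 units/mg of the native enzyme.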

Under the conditions outlined, the exonuclease activity decays exponentially, with a half-life of decay of
eight hours. Once per day the reaction vessel is mixed to distribute the soluble oxygen; otherwise the
reaction will proceed more rapidly at the surface, where the concentration of oxygen is higher. Once per day
2.5 mM DTT (0.3 ml of a fresh 250 mM stock to a 30 ml reaction) is added to replenish the oxidized DTT.
After eight hours, the exonuclease activity of T7 DNA polymerase has been reduced 50%, with
negligible loss of polymerase activity. The 50% loss may be the result of the complete inactivation of
exonuclease activity of half the polymerase molecules, rather than a general reduction of the rate of
exonuclease activity in all the molecules. Thus, after an eight hour reaction all the molecules have normal
polymerase activity, half the molecules have normal exonuclease activity, while the other half have <0.1%
of their original exonuclease activity.

When 50% of the molecules are modified (an eight hour reaction), the enzyme is suitable, although
suboptimal, for DNA sequencing. For optimum quality of DNA sequencing, the reaction is allowed to
proceed to greater than 99% modification (having less than 50 units of exonuclease activity), which requires
four days.
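
The time course can be sketched with a single-exponential model. This is an idealized illustration only, assuming the eight hour half-life holds throughout the reaction; it is not a description of the measured curve of Fig. 4, and the names are illustrative.

```python
import math

HALF_LIFE_H = 8.0            # half-life of exonuclease activity under 100% oxygen
NATIVE_UNITS_PER_MG = 5000.0


def fraction_remaining(hours, half_life=HALF_LIFE_H):
    """Fraction of exonuclease activity left after the given reaction time."""
    return 0.5 ** (hours / half_life)


def hours_to_reach(fraction, half_life=HALF_LIFE_H):
    """Reaction time needed for activity to fall to the given fraction."""
    return half_life * math.log2(1.0 / fraction)


t_99 = hours_to_reach(0.01)                       # 99% modification
after_4_days = fraction_remaining(96.0) * NATIVE_UNITS_PER_MG

print(f"99% modification in the ideal model: {t_99:.1f} hr")
print(f"activity after four days in the ideal model: {after_4_days:.2f} units/mg")
```

In this idealized model the 99% level is reached well within the four days allowed, so the four day reaction leaves a margin for slower modification in practice.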

After four days, the reaction mixture is dialyzed against 2 changes of 250 ml of 20 mM KPO4 pH 7.4/0.1
mM dithiothreitol/0.1 mM EDTA/50% glycerol to remove the iron. The modified T7 DNA polymerase (~4
mg/ml) is stored at -20°C.

The reaction mechanism for chemical modification of T7 DNA polymerase depends upon reactive
oxygen species generated by the presence of reduced transition metals such as Fe²⁺ and oxygen. A
possible reaction mechanism for the generation of hydroxyl radicals is outlined below:

(1) Fe²⁺ + O₂ → Fe³⁺ + O₂•⁻

(2) 2 O₂•⁻ + 2 H⁺ → H₂O₂ + O₂

(3) Fe²⁺ + H₂O₂ → Fe³⁺ + OH• + OH⁻

In equation 1, oxidation of the reduced metal ion yields superoxide radical, O₂•⁻. The superoxide
radical can undergo a dismutation reaction, producing hydrogen peroxide (equation 2). Finally, hydrogen
peroxide can react with reduced metal ions to form hydroxyl radicals, OH• (the Fenton reaction, equation 3).
The oxidized metal ion is recycled to the reduced form by reducing agents such as dithiothreitol (DTT).

These reactive oxygen species probably inactivate proteins by irreversibly chemically altering specific
amino acid residues. Such damage is observed in SDS-PAGE of fragments of gene 5 produced by CNBr or
trypsin. Some fragments disappear, high molecular weight cross linking occurs, and some fragments are
broken into two smaller fragments.

As previously mentioned, oxygen, a reducing agent (e.g. DTT, 2-mercaptoethanol) and a transition
metal (e.g. iron) are essential elements of the modification reaction. The reaction occurs in air, but is
stimulated three-fold by use of 100% oxygen. The reaction will occur slowly in the absence of added
transition metals due to the presence of trace quantities of transition metals (1-2 µM) in most buffer
preparations.

As expected, inhibitors of the modification reaction include anaerobic conditions (e.g., N2) and metal 
chelators (e.g. EDTA, citrate, nitrilotriacetate). In addition, the enzymes catalase and superoxide dismutase 
may inhibit the reaction, consistent with the essential role of reactive oxygen species in the generation of 
modified T7 DNA polymerase. 

As an alternative procedure, it is possible to genetically mutate the T7 gene 5 to specifically inactivate
the exonuclease domain of the protein. The T7 gene 5 protein purified from such mutants is ideal for use in
DNA sequencing without the need to chemically inactivate the exonuclease by oxidation and without the
secondary damage that inevitably occurs to the protein during chemical modification.

Genetically modified T7 DNA polymerase can be isolated by randomly mutagenizing the gene 5 and
then screening for those mutants that have lost exonuclease activity, without loss of polymerase activity.
Mutagenesis is performed as follows. Single-stranded DNA containing gene 5 (e.g., cloned in pEMBL-8, a
plasmid containing an origin for single-stranded DNA replication) under the control of a T7 RNA polymerase
promoter is prepared by standard procedure, and treated with two different chemical mutagens: hydrazine,
which will mutate C's and T's, and formic acid, which will mutate G's and A's (Myers et al., 229 Science 242,
1985). The DNA is mutagenized at a dose which results in an average of one base being altered per plasmid
molecule. The single-stranded mutagenized plasmids are then primed with a universal 17-mer primer (see
above), and used as templates to synthesize the opposite strands. The synthesized strands contain
randomly incorporated bases at positions corresponding to the mutated bases in the templates. The
double-stranded mutagenized DNA is then used to transform the strain K38/pGP1-2, which is strain K38 containing
the plasmid pGP1-2 (Tabor et al., supra). Upon heat induction this strain expresses T7 RNA polymerase.
The transformed cells are plated at 30°C, with approximately 200 colonies per plate.
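
The consequence of a dose giving an average of one altered base per plasmid can be illustrated with a Poisson model. This is a sketch under the assumption that hits occur independently and at random, which the text does not state; the names are illustrative.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k altered bases when the average is 'mean'."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


MEAN_HITS = 1.0          # average of one base altered per plasmid molecule
COLONIES_PER_PLATE = 200

p_none = poisson_pmf(0, MEAN_HITS)
p_single = poisson_pmf(1, MEAN_HITS)
p_multiple = 1.0 - p_none - p_single

print(f"unaltered: {p_none:.3f}, one change: {p_single:.3f}, two or more: {p_multiple:.3f}")
print(f"expected single-change plasmids per plate: {p_single * COLONIES_PER_PLATE:.0f}")
```

Under this assumption roughly a third of the plasmids carry exactly one change, and only a fraction of those changes fall within gene 5, which is why a large number of colonies is screened.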

Screening for cells having T7 DNA polymerase lacking exonuclease activity is based upon the following
finding. The 3' to 5' exonuclease of DNA polymerases serves a proofreading function. When bases are
misincorporated, the exonuclease will remove the newly incorporated base which is recognized as
"abnormal". This is the case for the analog of dATP, etheno-dATP, which is readily incorporated by T7
DNA polymerase in place of dATP. However, in the presence of the 3' to 5' exonuclease of T7 DNA
polymerase it is excised as rapidly as it is incorporated, resulting in no net DNA synthesis. As shown in
Figure 6, using the alternating copolymer poly d(AT) as a template, native T7 DNA polymerase catalyzes
extensive DNA synthesis only in the presence of dATP, and not etheno-dATP. In contrast, modified T7 DNA
polymerase, because of its lack of an associated exonuclease, stably incorporates etheno-dATP into DNA at
a rate comparable to dATP. Thus, using poly d(AT) as a template, and dTTP and etheno-dATP as
precursors, native T7 DNA polymerase is unable to synthesize DNA from this template, while T7 DNA
polymerase which has lost its exonuclease activity will be able to use this template to synthesize DNA.

The procedure for lysing and screening large numbers of colonies is described in Raetz (72 Proc. Nat.
Acad. Sci. 2274, 1975). Briefly, the K38/pGP1-2 cells transformed with the mutagenized gene 5-containing
plasmids are transferred from the petri dish, where they are present at approximately 200 colonies per
plate, to a piece of filter paper ("replica plating"). The filter paper discs are then placed at 42°C for 60 min
to induce the T7 RNA polymerase, which in turn expresses the gene 5 protein. Thioredoxin is constitutively
produced from the chromosomal gene. Lysozyme is added to the filter paper to lyse the cells. After a
freeze-thaw step to ensure cell lysis, the filter paper discs are incubated with poly d(AT), [α-32P]dTTP and
etheno-dATP at 37°C for 60 min. The filter paper discs are then washed with acid to remove the
unincorporated [32P]dTTP. DNA will precipitate on the filter paper in acid, while nucleotides will be soluble.
The washed filter paper is then used to expose X-ray film. Colonies which have induced an active T7 DNA
polymerase which is deficient in its exonuclease will have incorporated acid-insoluble 32P, and will be visible
by autoradiography. Colonies expressing native T7 DNA polymerase, or expressing a T7 DNA polymerase
defective in polymerase activity, will not appear on the autoradiograph.

Colonies which appear positive are recovered from the master petri dish containing the original
colonies. Cells containing each potential positive clone will be induced on a larger scale (one liter) and T7
DNA polymerase purified from each preparation to ascertain the levels of exonuclease associated with each
mutant. Those low in exonuclease are appropriate for DNA sequencing.

Directed mutagenesis may also be used to isolate genetic mutants in the exonuclease domain of the T7 
gene 5 protein. The following is an example of this procedure. 

T7 DNA polymerase with reduced exonuclease activity (modified T7 DNA polymerase) can also be
distinguished from native T7 DNA polymerase by its ability to synthesize through regions of secondary
structure. Thus, with modified DNA polymerase, DNA synthesis from a labeled primer on a template having
secondary structure will result in significantly longer extensions, compared to unmodified or native DNA
polymerase. This assay provides a basis for screening for the conversion of small percentages of DNA
polymerase molecules to a modified form.

The above assay was used to screen for altered T7 DNA polymerase after treatment with a number of
chemical reagents. Three reactions resulted in conversion of the enzyme to a modified form. The first is
treatment with iron and a reducing agent, as described above. The other two involve treatment of the
enzyme with photooxidizing dyes, Rose Bengal and methylene blue, in the presence of light. The dyes
must be titrated carefully, and even under optimum conditions the specificity of inactivation of exonuclease
activity over polymerase activity is low, compared to the high specificity of the iron-induced oxidation. Since
these dyes are quite specific for modification of histidine residues, this result strongly implicates histidine
residues as an essential species in the exonuclease active site.

There are 23 histidine residues in T7 gene 5 protein. Eight of these residues lie in the amino half of the
protein, in the region where, based on the homology with the large fragment of E. coli DNA polymerase I,
the exonuclease domain may be located (Ollis et al., Nature 313, 818, 1984). As described below, seven of
the eight histidine residues were mutated individually by synthesis of appropriate oligonucleotides, which
were then incorporated into gene 5. These correspond to mutants 1 and 6-10 in Table 1.

The mutations were constructed by first cloning the T7 gene 5 from pGP5-3 (Tabor et al., J. Biol.
Chem. 262, 1987) into the SmaI and HindIII sites of the vector M13 mp18, to give mGP5-2. (The vector
used and the source of gene 5 are not critical in this procedure.) Single-stranded mGP5-2 DNA was
prepared from a strain that incorporates deoxyuracil in place of deoxythymidine (Kunkel, Proc. Natl. Acad.
Sci. USA 82, 488, 1985). This procedure provides a strong selection for survival of only the synthesized
strand (that containing the mutation) when transfected into wild-type E. coli, since the strand containing uracil
will be preferentially degraded.

Mutant oligonucleotides, 15-20 bases in length, were synthesized by standard procedures. Each
oligonucleotide was annealed to the template, extended using native T7 DNA polymerase and ligated using
T4 DNA ligase. Covalently closed circular molecules were isolated by agarose gel electrophoresis, run in
the presence of 0.5 µg/ml ethidium bromide. The resulting purified molecules were then used to transform
E. coli 71.18. DNA from the resulting plaques was isolated and the relevant region sequenced to confirm
each mutation.

The following summarizes the oligonucleotides used to generate genetic mutants in the gene 5
exonuclease. The mutations created are underlined. Amino acid and base pair numbers are taken from
Dunn et al., 166 J. Molec. Biol. 477, 1983. The relevant wild-type sequences of the region of gene 5
mutated are also shown.


Wild-type sequence:

109 (aa)                                            122 123
Leu Leu Arg Ser Gly Lys Leu Pro Gly Lys Arg Phe Gly Ser His Ala Leu Glu
CTT CTG CGT TCC GGC AAG TTG CCC GGA AAA CGC TTT GGG TCT CAC GCT TTG GAG
14677 (T7 bp)


Mutation 1: His 123 → Ser 123

Primer used: 5' CGC TTT GG? TCC ??? GC? ??? 3'

Mutant sequence:

                                                        123
Leu Leu Arg Ser Gly Lys Leu Pro Gly Lys Arg Phe Gly Ser Ser Ala Leu Glu
CTT CTG CGT TCC GGC AAG TTG CCC GGA AAA CGC TTT GG? TCC ??? GC? ??? GAG


Mutation 2: Deletion of Ser 122 and His 123

Primer used: 5' GGA AAA CGC TTT GGC GCC TTG GAG GCG 3'

6 base deletion

Mutant sequence:

                                                    122 123
Leu Leu Arg Ser Gly Lys Leu Pro Gly Lys Arg Phe Gly Ala Leu Glu
CTT CTG CGT TCC GGC AAG TTG CCC GGA AAA CGC TTT GGC GCC TTG GAG




Mutation 3: Ser 122, His 123 → Ala 122, Glu 123

Primer used: 5' CGC TTT GGG GCT GAG GCT TTG G 3'

Mutant sequence:

                                                    122 123
Leu Leu Arg Ser Gly Lys Leu Pro Gly Lys Arg Phe Gly Ala Glu Ala Leu Glu
CTT CTG CGT TCC GGC AAG TTG CCC GGA AAA CGC TTT GGG GCT GAG GCT TTG GAG



Mutation 4: Lys 118, Arg 119 → Glu 118, Glu 119

Primer used: 5' G CCC GG? GA? GA? TTT GGG TCT CAC GC 3'

Mutant sequence:

                                    118 119
Leu Leu Arg Ser Gly Lys Leu Pro Gly Glu Glu Phe Gly Ser His Ala Leu Glu
CTT CTG CGT TCC GGC AAG TTG CCC GG? GA? GA? TTT GGG TCT CAC GCT TTG GAG

Mutation 5: Arg 111, Ser 112, Lys 114 → Glu 111, Ala 112, Glu 114

Primer used: 5' G GGT CTT CTG GA? GCC GGC GAG TTG CCC GG 3'

Mutant sequence:

        111 112     114
Leu Leu Glu Ala Gly Glu Leu Pro Gly Lys Arg Phe Gly Ser His Ala Leu Glu
CTT CTG GA? GCC GGC GAG TTG CCC GGA AAA CGC TTT GGG TCT CAC GCT TTG GAG

Mutation 6: His 59, His 62 → Ser 59, Ser 62

Primer used: 5' ATT GTG TTC ??C AAC GG? ??C AAG TAT GAC G 3'

Wild-type sequence:

aa:  55              59          62
     Leu Ile Val Phe His Asn Gly His Lys Tyr Asp Val
     CTT ATT GTG TTC CAC AAC GGT CAC AAG TAT GAC GTT
T7 bp: 14515

Mutant sequence:

                     59          62
     Leu Ile Val Phe Ser Asn Gly Ser Lys Tyr Asp Val
     CTT ATT GTG TTC ??C AAC GG? ??C AAG TAT GAC GTT



Mutation 7: His 82 → Ser 82

Primer used: 5' GAG TTC TCC CTT CCT CG 3'

Wild-type sequence:

aa:  77                  82
     Leu Asn Arg Glu Phe His Leu Pro Arg Glu Asn
     TTG AAC CGA GAG TTC CAC CTT CCT CGT GAG AAC
T7 bp: 14581

Mutant sequence:

                         82
     Leu Asn Arg Glu Phe Ser Leu Pro Arg Glu Asn
     TTG AAC CGA GAG TTC TCC CTT CCT CGT GAG AAC



Mutation 8: Arg 96, His 99 → Leu 96, Ser 99

Primer used: 5' CT? TTG ATT ??T TCC AAC CTC 3'

Wild-type sequence:

aa:  93          96          99
     Val Leu Ser Arg Leu Ile His Ser Asn Leu Lys Asp Thr Asp
     GTG TTG TCA CGT TTG ATT CAT TCC AAC CTC AAG GAC ACC GAT
T7 bp: 14629

Mutant sequence:

                 96          99
     Val Leu Ser Leu Leu Ile Ser Ser Asn Leu Lys Asp Thr Asp
     GTG TTG TCA CT? TTG ATT ??T TCC AAC CTC AAG GAC ACC GAT

Mutation 9: His 190 → Ser 190

Primer used: 5' CT GAC AAA TCT TAC TTC CCT 3'

Wild-type sequence:

aa:  185                 190
     Leu Leu Ser Asp Lys His Tyr Phe Pro Pro Glu
     CTA CTC TCT GAC AAA CAT TAC TTC CCT CCT GAG
T7 bp: 14905

Mutant sequence:

                         190
     Leu Leu Ser Asp Lys Ser Tyr Phe Pro Pro Glu
     CTA CTC TCT GAC AAA TCT TAC TTC CCT CCT GAG

Mutation 10: His 218 → Ser 218

Primer used: 5' GAC ATT GAA TCT CGT GCT GC 3'

Wild-type sequence:

aa:  214             218

     GTT
T7 bp: 14992

Mutant sequence:

     Val Asp Ile Glu Ser Arg Ala Ala Trp Leu Leu
     GTT GAC ATT GAA TCT CGT GCT GC? TGG CTG CTC



Mutation 11: Deletion of amino acids 118 to 123

Primer used: 5' C GGC AAG TTG CCC GGG GCT TTG GAG GCG TGG G 3'

18 base deletion

Wild-type sequence:

109 (aa)                            118             122 123         126
Leu Leu Arg Ser Gly Lys Leu Pro Gly Lys Arg Phe Gly Ser His Ala Leu Glu
CTT CTG CGT TCC GGC AAG TTG CCC GGA AAA CGC TTT GGG TCT CAC GCT TTG GAG
14677 (T7 bp)

Mutant sequence:

                                117                 124
Leu Leu Arg Ser Gly Lys Leu Pro Gly (6 amino acids) Ala Leu Glu
CTT CTG CGT TCC GGC AAG TTG CCC GGG (18 bases)      GCT TTG GAG



Mutation 12: His 123 → Glu 123

Primer used: 5' GGG TCT GAG GCT TTG G 3'

Mutant sequence:

                                                        123
Leu Leu Arg Ser Gly Lys Leu Pro Gly Lys Arg Phe Gly Ser Glu Ala Leu Glu
CTT CTG CGT TCC GGC AAG TTG CCC GGA AAA CGC TTT GGG TCT GAG GCT TTG GAG

Mutation 13: Arg 131, Lys 136, Lys 140, Lys 144, Arg 145 → Glu 131, Glu 136, Glu 140, Glu 144, Glu 145

Primer used: 5' GGT TAT GA? CTC GGC GAG ATG GAG GGT GAA TAC GAA GAC GAC TTT GAG GA? ATG
CTT GAA G 3'

Wild-type sequence:

aa:  129     131                 136             140             144 145
     Gly Tyr Arg Leu Gly Glu Met Lys Gly Glu Tyr Lys Asp Asp Phe Lys Arg Met Leu Glu Glu
     GGT TAT CGC TTA GGC GAG ATG AAG GGT GAA TAC AAA GAC GAC TTT AAG CGT ATG CTT GAA G
T7 bp: 14737

Mutant sequence:

aa:  129     131                 136             140             144 145
     Gly Tyr Glu Leu Gly Glu Met Glu Gly Glu Tyr Glu Asp Asp Phe Glu Glu Met Leu Glu Glu
     GGT TAT GA? CTC GGC GAG ATG GAG GGT GAA TAC GAA GAC GAC TTT GAG GA? ATG CTT GAA G
T7 bp: 14737
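
The amino acid and nucleotide lines of the listings above can be cross-checked by translating the codons with the standard genetic code. The following is a minimal sketch using the wild-type region beginning at codon 109 and the His 123 → Glu 123 change of mutation 12 (CAC to GAG); the helper names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

FIRST_RESIDUE = 109
WILD_TYPE = ("CTT CTG CGT TCC GGC AAG TTG CCC GGA AAA CGC TTT "
             "GGG TCT CAC GCT TTG GAG").split()


def translate(codons):
    """One-letter amino acid string for a list of codons."""
    return "".join(CODON_TABLE[c] for c in codons)


def substitute(codons, residue, new_codon, first_residue=FIRST_RESIDUE):
    """Copy of the codon list with the codon for 'residue' replaced."""
    mutant = list(codons)
    mutant[residue - first_residue] = new_codon
    return mutant


wild_protein = translate(WILD_TYPE)
mutant_12 = translate(substitute(WILD_TYPE, 123, "GAG"))

print(wild_protein)
print(mutant_12)
```

The same check can be applied to any of the other listings by changing the codon list, the first residue number, and the substituted codons.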

Each mutant gene 5 protein was produced by infection of the mutant phage into K38/pGP1-2, as
follows. The cells were grown at 30°C to an A590 = 1.0. The temperature was shifted to 42°C for 30 min to
induce T7 RNA polymerase. IPTG was added to 0.5 mM, and a lysate of each phage was added at a
moi = 10. Infected cells were grown at 37°C for 90 min. The cells were then harvested and extracts
prepared by standard procedures for T7 gene 5 protein.

Extracts were partially purified by passage over a phosphocellulose and DEAE A-50 column, and
assayed by measuring the polymerase and exonuclease activities directly, as described above. The results
are shown in Table 1.


Table 1

SUMMARY OF EXONUCLEASE AND POLYMERASE ACTIVITIES OF T7 GENE 5 MUTANTS

Mutant                                              Exonuclease    Polymerase
                                                    activity, %    activity, %

[Wild-type]                                         [100]a         [100]b

Mutant 1
(His 123 → Ser 123)                                 10-25          >90

Mutant 2
(Δ Ser 122, His 123)                                0.2-0.4        >90

Mutant 3
(Ser 122, His 123 → Ala 122, Glu 123)               <2             >90

Mutant 4
(Lys 118, Arg 119 → Glu 118, Glu 119)               <30            >90

Mutant 5
(Arg 111, Ser 112, Lys 114 →
Glu 111, Ala 112, Glu 114)                          >75            >90

Mutant 6
(His 59, His 62 → Ser 59, Ser 62)                   >75            >90

Mutant 7
(His 82 → Ser 82)                                   >75            >90

Mutant 8
(Arg 96, His 99 → Leu 96, Ser 99)                   >75            >90

Mutant 9
(His 190 → Ser 190)                                 >75            >90

Mutant 10
(His 218 → Ser 218)                                 >75            >90

Mutant 11
(Δ Lys 118, Arg 119, Phe 120,
Gly 121, Ser 122, His 123)                          <0.02          >90

Mutant 12
(His 123 → Glu 123)                                 <30

Mutant 13
(Arg 131, Lys 136, Lys 140, Lys 144, Arg 145 →
Glu 131, Glu 136, Glu 140, Glu 144, Glu 145)        <30            >90

a. Exonuclease activity was measured on single-stranded [3H]T7
DNA. 100% exonuclease activity corresponds to 5,000 units/mg.

b. Polymerase activity was measured using single-stranded calf thymus
DNA. 100% polymerase activity corresponds to 8,000 units/mg.



Of the seven histidines tested, only one (His 123: mutant 1) has the enzymatic activities characteristic
of modified T7 DNA polymerase. T7 gene 5 protein was purified from this mutant using DEAE-cellulose,
phosphocellulose, DEAE-Sephadex and hydroxylapatite chromatography. While the polymerase activity was
nearly normal (>90% the level of the native enzyme), the exonuclease activity was reduced 4 to 10-fold.

A variant of this mutant was constructed in which both His 123 and Ser 122 were deleted. The gene 5
protein purified from this mutant has a 200-500 fold lower exonuclease activity, again with retention of
>90% of the polymerase activity.

These data strongly suggest that His 123 lies in the active site of the exonuclease domain of T7 gene 5
protein. Furthermore, it is likely that His 123 is in fact the residue being modified by the oxidation
involving iron, oxygen and a reducing agent, since such oxidation has been shown to modify histidine
residues in other proteins (Levine, J. Biol. Chem. 258: 11823, 1983; and Hodgson et al., Biochemistry 14:
5294, 1975). The level of residual exonuclease in mutant 11 is comparable to the levels obtainable by
chemical modification.

Although mutations at His residues are described, mutations at nearby sites or even at distant sites may
also produce mutant enzymes suitable in this invention, e.g., lys and arg (mutants 4 and 13). Similarly,
although mutations in some His residues have little effect on exonuclease activity, that does not necessarily
indicate that mutations near these residues will not affect exonuclease activity. Mutations which are
especially effective include those having deletions of 2 or more amino acids, preferably 6-8, for example,
near the His-123 region. Other mutations should reduce exonuclease activity further, or completely.

As an example of the use of these mutant strains the following is illustrative. A pGP5-6 (mutation 11)-
containing strain has been deposited with the ATCC (see below). The strain is grown as described above
and induced as described in Tabor et al., J. Biol. Chem. 262:16212 (1987). K38/pTrx-2 cells may be added
to increase the yield of genetically modified T7 DNA polymerase.

The above noted deposited strain also contains plasmid pGP1-2 which expresses T7 RNA polymerase.
This plasmid is described in Tabor et al., Proc. Nat. Acad. Sci. USA 82:1074, 1985 and was deposited with
the ATCC on March 22, 1985 and assigned the number 40,175.

Referring to Fig. 10, pGP5-6 includes the following segments:

1. EcoRI-SacI-SmaI-BamHI polylinker sequence from M13 mp10 (21 bp).

2. T7 bp 14309 to 16747, that contains the T7 gene 5, with the following modifications:
T7 bp 14703 is changed from an A to a G, creating a SmaI site.

T7 bp 14304 to 14321 inclusive are deleted (18 bp).

3. SalI-PstI-HindIII polylinker sequence from M13 mp10 (15 bp).

4. pBR322 bp 29 (HindIII site) to pBR322 bp 375 (BamHI site).

5. T7 bp 22855 to T7 bp 22927, that contains the T7 RNA polymerase promoter φ10, with BamHI
linkers inserted at each end (82 bp).

6. pBR322 bp 375 (BamHI site) to pBR322 bp 4361 (EcoRI site).

DNA Sequencing Using Modified T7-type DNA Polymerase 

DNA synthesis reactions using modified T7-type DNA polymerase result in chain-terminated fragments
of uniform radioactive intensity, throughout the range of several bases to thousands of bases in length.
There is virtually no background due to terminations at sites independent of chain terminating agent
incorporation (i.e. at pause sites or secondary structure impediments).

Sequencing reactions using modified T7-type DNA polymerase consist of a pulse and chase. By pulse
is meant that a short labelled DNA fragment is synthesized; by chase is meant that the short fragment is
lengthened until a chain terminating agent is incorporated. The rationale for each step differs from
conventional DNA sequencing reactions. In the pulse, the reaction is incubated at 0°C-37°C for 0.5-4 min
in the presence of high levels of three nucleotide triphosphates (e.g., dGTP, dCTP and dTTP) and limiting
levels of one other labelled, carrier-free, nucleotide triphosphate, e.g., [35S]dATP. Under these conditions
the modified polymerase is unable to exhibit its processive character, and a population of radioactive
fragments will be synthesized ranging in size from a few bases to several hundred bases. The purpose of
the pulse is to radioactively label each primer, incorporating maximal radioactivity while using minimal
levels of radioactive nucleotides. In this example, two conditions in the pulse reaction (low temperature, e.g.,
from 0-20°C, and limiting levels of dATP, e.g., from 0.1 µM to 1 µM) prevent the modified T7-type DNA
polymerase from exhibiting its processive character. Other essential environmental components of the
mixture will have similar effects, e.g., limiting more than one nucleotide triphosphate or increasing the ionic
strength of the reaction. If the primer is already labelled (e.g., by kinasing) no pulse step is required.

In the chase, the reaction is incubated at 45°C for 1-30 min in the presence of high levels (50-500 µM)
of all four deoxynucleoside triphosphates and limiting levels (1-50 µM) of any one of the four chain
terminating agents, e.g., dideoxynucleoside triphosphates, such that DNA synthesis is terminated after an
average of 50-600 bases. The purpose of the chase is to extend each radioactively labeled primer under
conditions of processive DNA synthesis, terminating each extension exclusively at correct sites in four
separate reactions using each of the four dideoxynucleoside triphosphates. Two conditions of the chase
(high temperature, e.g., from 30-50°C, and high levels (above 50 µM) of all four deoxynucleoside
triphosphates) allow the modified T7-type DNA polymerase to exhibit its processive character for tens of
thousands of bases; thus the same polymerase molecule will synthesize from the primer-template until a
dideoxynucleotide is incorporated. At a chase temperature of 45°C synthesis occurs at >700
nucleotides/sec. Thus, for sequencing reactions the chase is complete in less than a second. ssb increases
processivity, for example, when using dITP, or when using low temperatures or high ionic strength, or low
levels of triphosphates throughout the sequencing reaction.

Either [α-35S]dATP, [α-32P]dATP or fluorescently labelled nucleotides can be used in the DNA sequencing
reactions with modified T7-type DNA polymerase. If the fluorescent analog is at the 5' end of the primer,
then no pulse step is required.

Two components determine the average extensions of the synthesis reactions. First is the length of
time of the pulse reaction. Since the pulse is done in the absence of chain terminating agents, the longer
the pulse reaction time, the longer the primer extensions. At 0°C the polymerase extensions average 10
nucleotides/sec. Second is the ratio of deoxyribonucleoside triphosphates to chain terminating agents in the
chase reaction. A modified T7-type DNA polymerase does not discriminate against the incorporation of
these analogs, thus the average length of extension in the chase is four times the ratio of the
deoxynucleoside triphosphate concentration to the chain terminating agent concentration in the chase reaction.
Thus, in order to shorten the average size of the extensions, the pulse time is shortened, e.g., to 30 sec,
and/or the ratio of chain terminating agent to deoxynucleoside triphosphate concentration is raised in the
chase reaction. This can be done either by raising the concentration of the chain terminating agent or
lowering the concentration of deoxynucleoside triphosphate. To increase the average length of the
extensions, the pulse time is increased, e.g., to 3-4 min, and/or the concentration of chain terminating agent
is lowered (e.g., from 20 µM to 2 µM) in the chase reaction.
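
The two relationships stated above (pulse extension as rate times time, and chase extension as four times the dNTP to chain-terminator ratio) can be sketched numerically. This is a minimal illustration only; the function names are hypothetical, and the 200 µM dNTP figure used in the example is an assumed value chosen for convenience, not a condition prescribed by the text.

```python
def pulse_extension(rate_nt_per_sec: float, seconds: float) -> float:
    """Average primer extension in the pulse, taken as rate x time."""
    return rate_nt_per_sec * seconds


def average_chase_extension(dntp_um: float, terminator_um: float) -> float:
    """Average chase extension (nucleotides): 4 x [dNTP] / [terminator]."""
    if dntp_um <= 0 or terminator_um <= 0:
        raise ValueError("concentrations must be positive")
    return 4.0 * dntp_um / terminator_um


# 30 sec pulse at the stated 10 nucleotides/sec (0 degrees C)
short_pulse = pulse_extension(10.0, 30.0)

# Lowering the terminator from 20 uM to 2 uM at an assumed 200 uM dNTP
# lengthens the average chase extension tenfold.
ext_20um = average_chase_extension(200.0, 20.0)
ext_2um = average_chase_extension(200.0, 2.0)
print(short_pulse, ext_20um, ext_2um)
```

The factor of four reflects that each chase tube holds only one of the four chain terminating agents, so on average only one position in four is a candidate termination site.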



Example 2: DNA sequencing using modified T7 DNA polymerase 

The following is an example of a sequencing protocol using dideoxy nucleotides as terminating agents. 

9 µl of single-stranded M13 DNA (mGP1-2, prepared by standard procedures) at 0.7 mM concentration
is mixed with 1 µl of complementary sequencing primer (standard universal 17-mer, 0.5 pmole primer/µl)
and 2.5 µl 5X annealing buffer (200 mM Tris-HCl, pH 7.5, 50 mM MgCl2), heated to 65°C for 3 min, and
slow cooled to room temperature over 30 min. In the pulse reaction, 12.5 µl of the above annealed mix was
mixed with 1 µl dithiothreitol 0.1 M, 2 µl of 3 dNTPs (dGTP, dCTP, dTTP) 3 mM each (P-L Biochemicals, in
TE), 2.5 µl [α-35S]dATP (1500 Ci/mmol, New England Nuclear) and 1 µl of modified T7 DNA polymerase
described in Example 1 (0.4 mg/ml, 2500 units/ml, i.e. 0.4 µg, 2.5 units) and incubated at 0°C for 2 min,
after vortexing and centrifuging in a microfuge for 1 sec. The time of incubation can vary from 30 sec to 20
min and temperature can vary from 0°C to 37°C. Longer times are used for determining sequences distant
from the primer.

4.5 µl aliquots of the above pulse reaction are added to each of four tubes containing the chase mixes,
preheated to 45°C. The four tubes, labeled G, A, T, C, each contain trace amounts of either dideoxy (dd)
G, A, T, or C (P-L Biochemicals). The specific chase solutions are given below. Each tube contains 1.5 µl
dATP 1 mM, 0.5 µl 5X annealing buffer (200 mM Tris-HCl, pH 7.5, 50 mM MgCl2), and 1.0 µl ddNTP 100
µM (where ddNTP corresponds to ddG, A, T or C in the respective tubes). Each chase reaction is incubated
at 45°C (or 30°C-50°C) for 10 min, and then 6 µl of stop solution (90% formamide, 10 mM EDTA, 0.1%
xylene cyanol) is added to each tube, and the tube placed on ice. The chase times can vary from 1-30 min.
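
As a bookkeeping check on the volumes quoted in this example, the following sketch sums the pulse reaction components and confirms that four 4.5 µl aliquots can be drawn from it. The component labels and the helper name are illustrative assumptions; only the volumes come from the protocol above.

```python
# Pulse reaction components from Example 2, in microliters.
pulse_components_ul = {
    "annealed template-primer mix": 12.5,
    "dithiothreitol, 0.1 M": 1.0,
    "3 dNTP mix (dGTP, dCTP, dTTP)": 2.0,
    "[alpha-35S]dATP": 2.5,
    "modified T7 DNA polymerase": 1.0,
}


def pulse_volume_budget(components, aliquot_ul, n_tubes=4):
    """Return (total pulse volume, volume left after n_tubes aliquots)."""
    total = sum(components.values())
    spare = total - aliquot_ul * n_tubes
    if spare < 0:
        raise ValueError("pulse reaction too small for the requested aliquots")
    return total, spare


total_ul, spare_ul = pulse_volume_budget(pulse_components_ul, aliquot_ul=4.5)
print(f"pulse volume: {total_ul} ul, spare after G/A/T/C aliquots: {spare_ul} ul")
```

The small surplus gives a margin for pipetting loss when distributing the pulse reaction to the G, A, T and C tubes.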

The sequencing reactions are run on a standard 6% polyacrylamide sequencing gel in 7 M urea, at 30
Watts for 6 hours. Prior to running on a gel the reactions are heated to 75°C for 2 min. The gel is fixed in
10% acetic acid, 10% methanol, dried on a gel dryer, and exposed to Kodak OM1 high-contrast
autoradiography film overnight.



Example 3: DNA sequencing using limiting concentrations of dNTPs 



In this example DNA sequence analysis of mGP1-2 DNA is performed using limiting levels of all four
deoxyribonucleoside triphosphates in the pulse reaction. This method has a number of advantages over the
protocol in Example 2. First, the pulse reaction runs to completion, whereas in the previous protocol it was
necessary to interrupt a time course. As a consequence the reactions are easier to run. Second, with this
method it is easier to control the extent of the elongations in the pulse, and so the efficiency of labeling of
sequences near the primer (the first 50 bases) is increased approximately 10-fold.

7 µl of 0.75 mM single-stranded M13 DNA (mGP1-2) was mixed with 1 µl of complementary sequencing
primer (17-mer, 0.5 pmole primer/µl) and 2 µl 5X annealing buffer (200 mM Tris-HCl pH 7.5, 50 mM MgCl2,
250 mM NaCl), heated at 65°C for 2 min, and slowly cooled to room temperature over 30 min. In the pulse
reaction 10 µl of the above annealed mix was mixed with 1 µl dithiothreitol 0.1 M, 2 µl of 3 dNTPs (dGTP,
dCTP, dTTP) 1.5 µM each, 0.5 µl [α-35S]dATP (about 10 µM, 1500 Ci/mmol, New England
Nuclear) and 1 µl modified T7 DNA polymerase (0.1 mg/ml, 1000 units/ml, i.e., 0.2 µg, 2 units) and
incubated at 37°C for 5 min. (The temperature and time of incubation can be varied from 20°C-45°C and
1-60 min., respectively.)

3.5 µl aliquots of the above pulse reaction were added to each of four tubes containing the chase
mixes, which were preheated to 37°C. The four tubes, labeled G, A, T, C, each contain trace amounts of
either dideoxy G, A, T, C. The specific chase solutions are given below. Each tube contains 0.5 µl 5X
annealing buffer (200 mM Tris-HCl pH 7.5, 50 mM MgCl2, 250 mM NaCl), 1 µl 4 dNTPs (dGTP, dATP,
dTTP, dCTP) 200 µM each, and 1.0 µl ddNTP 20 µM. Each chase reaction is incubated at 37°C for 5 min
(or 20°C-45°C and 1-60 min respectively), and then 4 µl of a stop solution (95% formamide, 20 mM
EDTA, 0.05% xylene cyanol) added to each tube, and the tube placed on ice prior to running on a standard
polyacrylamide sequencing gel as described above.
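
For orientation, the final concentrations in each chase tube of this example can be estimated by simple dilution. The sketch below is an approximation under stated assumptions: it ignores the small carry-over of nucleotides from the pulse aliquot, and the helper name is hypothetical.

```python
def final_concentration(stock_um: float, added_ul: float, total_ul: float) -> float:
    """Concentration after dilution: stock x (added volume / total volume)."""
    return stock_um * added_ul / total_ul


# Chase tube of Example 3: 3.5 ul pulse aliquot + 0.5 ul buffer
# + 1 ul of the 4 dNTP mix + 1 ul ddNTP.
chase_total_ul = 3.5 + 0.5 + 1.0 + 1.0

dntp_final_um = final_concentration(200.0, 1.0, chase_total_ul)
ddntp_final_um = final_concentration(20.0, 1.0, chase_total_ul)

print(f"chase volume {chase_total_ul} ul")
print(f"each dNTP about {dntp_final_um:.1f} uM, ddNTP about {ddntp_final_um:.1f} uM")
```

Because both stocks are diluted by the same factor, the dNTP to ddNTP ratio in the tube stays at the ratio of the stock concentrations.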

Example 4: Replacement of dGTP with dITP for DNA sequencing

In order to sequence through regions of compression in DNA, i.e., regions having compact secondary
structure, it is common to use dITP (Mills et al., 76 Proc. Natl. Acad. Sci. 2232, 1979) or deazaguanosine
triphosphate (deaza GTP, Mizusawa et al., 14 Nuc. Acid Res. 1319, 1986). We have found that both analogs
function well with T7-type polymerases, especially with dITP in the presence of ssb. Preferably these
reactions are performed with the above described genetically modified T7 polymerase, or the chase
reaction is for 1-2 min., and/or at 20°C to reduce exonuclease degradation.

Modified T7 DNA polymerase efficiently utilises dITP or deaza-GTP in place of dGTP. dITP is
substituted for dGTP in both the pulse and chase mixes at a concentration two to five times that at which
dGTP is used. In the ddG chase mix ddGTP is still used (not ddITP).

The chase reactions using dITP are sensitive to the residual low levels (about 0.01 units) of
exonuclease activity. To avoid this problem, the chase reaction times should not exceed 5 min when dITP is
used. It is recommended that the four dITP reactions be run in conjunction with, rather than to the exclusion
of, the four reactions using dGTP. If both dGTP and dITP are routinely used, the number of required mixes
can be minimized by: (1) Leaving dGTP and dITP out of the chase mixes, which means that the four chase
mixes can be used for both dGTP and dITP chase reactions. (2) Adding a high concentration of dGTP or
dITP (2 µl at 0.5 mM and 1-2.5 mM respectively) to the appropriate pulse mix. The two pulse mixes then
each contain a low concentration of dCTP, dTTP and [α-35S]dATP, and a high concentration of either dGTP
or dITP. This modification does not usually adversely affect the quality of the sequencing reactions, and
reduces the required number of pulse and chase mixes to run reactions using both dGTP and dITP to six.

The sequencing reaction is as for Example 3, except that two of the pulse mixes contain a) 3 dNTP mix
for dGTP: 1.5 µM dCTP, dTTP, and 1 mM dGTP and b) 3 dNTP mix for dITP: 1.5 µM dCTP, dTTP, and 2
mM dITP. In the chase reaction dGTP is removed from the chase mixes (i.e. the chase mixes contain 30
µM dATP, dTTP and dCTP, and one of the four dideoxynucleotides at 8 µM), and the chase time using
dITP does not exceed 5 min.



Deposits 

Strains K38/pGP5-5/pTrx-2, K38/pTrx-1 and M13 mGP1-2 have been deposited with the ATCC and
assigned numbers 67,287, 67,286, and 40,303 respectively. These deposits were made on January 13,
1987. Strain K38/pGP1-2/pGP5-6 was deposited with the ATCC on December 4, 1987, and assigned the
number 67571.

Applicants and their assignees acknowledge their responsibility to replace these cultures should they
die before the end of the term of a patent issued hereon, 5 years after the last request for a culture, or 30
years, whichever is the longer, and their responsibility to notify the depository of the issuance of such a
patent, at which time the deposits will be made irrevocably available to the public. Until that time the
deposits will be made irrevocably available to the Commissioner of Patents under the terms of 37 CFR
Section 1.14 and 35 USC Section 112.



Other Embodiments

Other embodiments are within the following claims.

Other uses of the modified DNA polymerases of this invention, which take advantage of their
processivity, and lack of exonuclease activity, include the direct enzymatic amplification of genomic DNA
sequences. This has been described, for other polymerases, by Saiki et al., 230 Science 1350, 1985; and
Scharf, 233 Science 1076, 1986.

Referring to Fig. 6, enzymatic amplification of a specific DNA region entails the use of two primers
which anneal to opposite strands of a double stranded DNA sequence in the region of interest, with their 3'
ends directed toward one another (see dark arrows). The actual procedure involves multiple (10-40,
preferably 16-20) cycles of denaturation, annealing, and DNA synthesis. Using this procedure it is possible
to amplify a specific region of human genomic DNA over 200,000 times. As a result the specific gene
fragment represents about one part in five, rather than the initial one part in a million. This greatly facilitates
both the cloning and the direct analysis of genomic DNA. For diagnostic uses, it can speed up the analysis
from several weeks to 1-2 days.
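
The figures quoted above can be related by a short calculation. The sketch below assumes ideal doubling of the target in every cycle, which is an idealization introduced here and not a claim of the text; the function names are hypothetical.

```python
import math


def cycles_for_fold(fold: float) -> int:
    """Smallest number of ideal doubling cycles giving at least `fold` amplification."""
    return math.ceil(math.log2(fold))


def target_fraction(initial_fraction: float, fold: float) -> float:
    """Fraction of total DNA that is target after amplifying only the target."""
    target = initial_fraction * fold
    background = 1.0 - initial_fraction
    return target / (target + background)


n_cycles = cycles_for_fold(200_000)
fraction = target_fraction(1e-6, 200_000)
print(n_cycles, round(fraction, 3))
```

Under this idealization, a 200,000-fold amplification falls within the preferred 16-20 cycle range, and a target starting at one part in a million ends at a target-to-background ratio of roughly 1:5, which is consistent with the "about one part in five" stated above.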

Unlike Klenow fragment, where the amplification process is limited to fragments under two hundred
bases in length, modified T7-type DNA polymerases should (preferably in conjunction with E. coli DNA
binding protein, or ssb, to prevent "snapback" formation of single stranded DNA) permit the amplification of
DNA fragments thousands of bases in length.

The modified T7-type DNA polymerases are also suitable in standard reaction mixtures: a) for filling in
5' protruding termini of DNA fragments generated by restriction enzyme cleavage, in order to, for example,
produce blunt-ended double stranded DNA from a linear DNA molecule having a single stranded region with
no 3' protruding termini; b) for labeling the 3' termini of restriction fragments, for mapping mRNA start sites
by S1 nuclease analysis, or sequencing DNA using the Maxam and Gilbert chemical modification
procedure; and c) for in vitro mutagenesis of cloned DNA fragments. For example, a chemically synthesized
primer which contains specific mismatched bases is hybridized to a DNA template, and then extended by
the modified T7-type DNA polymerase. In this way the mutation becomes permanently incorporated into the
synthesized strand. It is advantageous for the polymerase to synthesize from the primer through the entire
length of the DNA. This is most efficiently done using a processive DNA polymerase. Alternatively
mutagenesis is performed by misincorporation during DNA synthesis (see above). This application is used
to mutagenize specific regions of cloned DNA fragments. It is important that the enzyme used lack
exonuclease activity. By standard reaction mixture is meant a buffered solution containing the polymerase
and any necessary deoxynucleosides, or other compounds.



Claims 

1. A method of amplification of a DNA sequence comprising annealing a first and second primer to
opposite strands of a double stranded DNA sequence and incubating the annealed mixture with a DNA
polymerase characterized in that said polymerase is a processive T7-type DNA polymerase, having less
than 50% of the exonuclease activity of the naturally associated level of exonuclease activity of said
polymerase.

2. A method as claimed in claim 1 further characterized in that said DNA polymerase has less than 500
units of exonuclease activity per mg of polymerase, and in that said first and second primers anneal to
opposite strands of said DNA sequence with their 3' ends directed toward each other after annealing, and
with the DNA sequence to be amplified located between the two annealed primers.

3. A method as claimed in claim 1 or 2 further characterized in that said polymerase possesses
sufficient processivity to remain bound to said DNA sequence for at least 500 bases before dissociating.

4. A method as claimed in claim 1, 2 or 3 further characterized in that said polymerase has less than
1% of the exonuclease activity naturally associated with said polymerase.

5. A method as claimed in any of the preceding claims in which said polymerase is T7 DNA
polymerase.

FIGURE 6

FIGURE 7



        10         20         30         40         50
TTCTTCTCAT GTTTGACAGC TTATCATCGA CTGCACGGTG CACCAATGCT
        60         70         80         90        100
TCTGGCGTCA GGCAGCCATC GGAAGCTGTG GTATGGCTGT GCAGGTCGTA
       110        120        130        140        150
AATCACTGCA TAATTCGTGT CGCTCAAGGC GCACTCCCGT TCTGGATAAT
       160        170        180        190        200
GTTTTTTGCG CCGACATCAT AACGGTTCTG GCAAATATTC TGAAATGAGC
       210        220        230        240        250
TGTTGACAAT TAATCATCGG CTCGTATAAT GTGTGGAATT GTGAGCGGAT
       260        270        280        290        300
AACAATTTCA CACAGGAAAC AGGGGATCCG TCAACCTTTA GTTGGTTAAT
       310        320        330        340        350
GTTACACCAA CAACGAAACC AACACGCCAG GCTTATTCCT GTGGAGTTAT
       360        370        380        390        400
ATATGAGCGA TAAAATTATT CACCTGACTG ACGACAGTTT TGACACGGAT
       410        420        430        440        450
GTACTCAAAG CGGACGGGGC GATCCTCGTC GATTTCTGGG CAGAGTGGTG
       460        470        480        490        500
CGGTCCGTGC AAGATGATCG CCCCGATTCT GGATGAAATC GCTGACGAAT

FIGURE 7 (continued)



       510        520
ATCAGGGCAA ACTGACCGTT
       560        570
ACTGCGCCGA AATATGGCAT
       610        620
AAACGGTGAA GTGGCGGCAA
       660        670
TGAAAGAGTT CCTCGACGCT
       710        720
TGCCCCGTCG CTAAAAACTG
       760        770
GTTGACGGAT CCCCGGGGAT
       810        820
CAACAACGAA ACCAACACGC
       860        870
CGATAAAATT ATTCACCTGA
       910        920
AAGCGGACGG GGCGATCCTC
       960        970
TGCAAGATGA TCGCCCCGAT
      1010       1020
CAAACTGACC GTTGCAAAAC
      1060       1070
CGAAATATGG CATCCGTGGT
      1110       1120
GAAGTGGCGG CAACCAAAGT
      1160       1170
GTTCCTCGAC GCTAACCTGG
      1210       1220
TCGCTAAAAA CTGGACGCCC
      1260       1270
GATCCCCCTG CCTCGCGCGT
      1310       1320
ATGCAGCTCC CGGAGACGGT
      1360       1370
CAGACAAGCC CGTCAGGGCG
      1410       1420
CAGCCATGAC CCAGTCACGT
      1460       1470
ATGCGGCATC AGAGCAGATT
      1510       1520
ATACCGCACA GATGCGTAAG
      1560       1570
TTCCTCGCTC ACTGACTCGC
      1610       1620
TATCAGCTCA CTCAAAGGCG
      1660       1670
AACGCAGGAA AGAACATGTG
      1710       1720
TAAAAAGGCC GCGTTGCTGG
      1760       1770
AGCATCACAA AAATCGACGC
      1810
CTATAAAGAT

FIGURE 7 (continued)
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•TAC GACTCACTAT 

ATTCGCAAGj GTGGCCTt| TGAITGaI^? TCITCCgI^? AATACGA^^ 



CTAAAGGTAA CTTGAACc| CGTGACa|; TAGAGTCG^l CTTCGCG^I? 
GCGTAACGCC A^ATCAATAC GACICACtI? AGAGGGaJI^ ACTCAAgIJ? 
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ATT ACTAAGAGAG 
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.CGT TCTAACCGTA 

ACCAAAGGTC GCAAGTIGAA TAAGACT^ CGTGACCGc? CTCACAAgIg' 



GAGATTTAAA TTAAAGAATT ACTAAGaIIg GACTTTaIg? ATCCCTAic? 
TCGAAAAGAT GACCAAACGT TCTAACCGTA ATGCTCGTGA CTTCGAGgJa 



tagctgggag ggtcagtaag atgggacgtt tatatag^gg taatctgI?a 

CCGGATCCGG TATGAAGAGA TTGTTAaIt^ ACGATAaJ^A ATAGGAgII^ 

tcaatatgat cgtttctgac atcgaag??a acgccctct? agagagcg??
FIGURE 8 (continued)
GTTCCTGCAT TGACCAAACT GGCAAAGTTG CAATTGAACC GAGAGTTCCA
CCTTCCTCGT GAGAACTGTA TTGACACCCT TGTGTTGTCA CGTTTGATTC
ATTCCAACCT CAAGGACACC GATATGGGTC TTCTGCGTTC CGGCAAGTTG
910 920 930 940 950 

CCCGGAAAAC GCTTTGGGTC TCACGCTTTG GAGGCGTGGG GTTATCGCTT
AGGCGAGATG AAGGGTGAAT ACAAAGACGA CTTTAAGCGT ATGCTTGAAG
1010 1020 1030 1040 1050 

AGCAGGGTGA AGAATACGTT GACGGAATGG AGTGGTGGAA CTTCAACGAA 
1060 1070 1080 1090 1100 

GAGATGATGG ACTATAACGT TCAGGACGTT GTGGTAACTA AAGCTCTCCT 
1110 1120 1130 1140 1150 

TGAGAAGCTA CTCTCTGACA AACATTACTT CCCTCCTGAG ATTGACTTTA 
1160 1170 1180 1190 1200 

CGGACGTAGG ATACACTACG TTCTGGTCAG AATCCCTTGA GGCCGTTGAC 
1210 1220 1230 1240 1250 

ATTGAACATC GTGCTGCATG GCTGCTCGCT AAACAAGAGC GCAACGGGTT 
1260 1270 1280 1290 1300 

CCCGTTTGAC ACAAAAGCAA TCGAAGAGTT GTACGTAGAG TTAGCTGCTC 

1310 1320 1330 1340 1350

GCCGCTCTGA GTTGCTCCGT AAATTGACCG AAACGTTCGG CTCGTGGTAT 

1360 1370 1380 1390 1400

CAGCCTAAAG GTGGCACTGA GATGTTCTGC CATCCGCGAA CAGGTAAGCC 
1410 1420 1430 1440 1450 

ACTACCTAAA TACCCTCGCA TTAAGACACC TAAAGTTGGT GGTATCTT-^A 

1460 1470 1480 1490 1500 

AGAAGCCTAA GAACAAGGCA CAGCGAGAAG GCCGTGAGCC TTGCGAACTT 

1510 1520 1530 1540 1550

GATACCCGCG AGTACGTTGC TGGTGCTCCT TACACCCCAG TTGAACATGT 

1560 1570 1580 1590 1600 

TGTGTTTAAC CCTTCGTCTC GTGACCACAT TCAGAAGAAA CTCCAAGAGG 

1610 1620 1630 1640 1650 

CTGGGTGGGT CCCGACCAAG TACACCGATA AGGGTGCTCC TGTGGTGGAC 

1660 1670 1680 1690 1700 

GATGAGGTAC TCGAAGGAGT ACGTGTAGAT GACCCTGAGA AGCAAGCCGC 

1710 1720 1730 1740 1750 

TATCGACCTC ATTAAAGAGT ACTTGATGAT TCAGAAGCGA ATCGGACAGT 

1760 1770 1780 1790 1800 

CTGCTGAGGG AGACAAAGCA TGGCTTCGTT ATGTTGCTGA GGATGGTAAG 

1810 1820 1830 1840 1850 

ATTCATGGTT CTGTTAACCC TAATGGAGCA GTTACGGGTC GTGCGACCCA 

1860 1870 1880 1890 1900 

TGCGTTCCCA AACCTTGCGC AAATTCCGGG TGTACGTTCT CCTTATGGAG 

1910 1920 1930 1940 1950

AGCAGTGTCG CGCTGCTTTT GGCGCTGAGC ACCATTTGGA TGGGATAACT
FIGURE 8 (continued)



1960 1970 1980 1990 2000 

GGTAAGCCTT GGGTTCAGGC TGGCATCGAC GCATCCGGTC TTGAGCTACG
2010 2020 2030 2040 2050 

CTGCTTGGCT CACTTpATGG CTCGCTTTGA TAACGGCGAG TACGCTCACG 
2060 2070 2080 2090 2100 

AGATTCTTAA CGGCGACATC CACACTAAGA ACCAGATAGC TGCTGAACTA 
2110 2120 2130 2140 2150 

CCTACCCGAG ATAACGCTAA GACGTTCATC TATGGGTTCC TCTATGGTGC 
2160 2170 2180 2190 2200 

TGGTGATGAG AAGATTGGAC AGATTGTTGG TGCTGGTAAA GAGCGCGGTA
2210 2220 2230 2240 2250 

AGGAACTCAA GAAGAAATTC CTTGAGAACA CCCCCGCGAT TGCAGCACTC 
2260 2270 2280 2290 2300 

CGCGAGTCTA TCCAACAGAC ACTTGTCGAG TCCTCTCAAT GGGTAGCTGG 
2310 2320 2330 2340 2350 

TGAGCAACAA GTCAAGTGGA AACGCCGCTG GATTAAAGGT CTGGATGGTC 
2360 2370 2380 2390 2400 

GTAAGGTACA CGTTCGTAGT CCTCACGCTG CCTTGAATAC CCTACTGCAA 
2410 2420 2430 2440 2450 

TCTGCTGGTG CTCTCATCTG CAAACTGTGG ATTATCAAGA CCGAAGAGAT
2460 2470 2480 2490 2500

GCTCGTAGAG AAAGGCTTGA AGCATGGCTG GGATGGGGAC TTTGCGTACA
2510 2520 2530 2540 2550

TGGCATGGGT ACATGATGAA ATCCAAGTAG GCTGCCGTAC CGAAGAGATT
2560 2570 2580 2590 2600

GCTCAGGTGG TCATTGAGAC CGCACAAGAA GCGATGCGCT GGGTTGGAGA
2610 2620 2630 2640 2650

CCACTGGAAC TTCCGGTGTC TTCTGGATAC CGAAGGTAAG ATGGGTCCTA
ATTGGGCGAT TTGCCACTGA TACAGGAGGC TACTCATGAA CGAAAGACAC
TCCATGTGTG TCGGCAAGAG AAACAAAGGC ATAAAACTAT AGGAGAAAT'^
ATTATGGCTA TGACAAAGAA ATTTCCGGAT C
FIGURE 9
FIGURE 9 (continued)
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7G777AAGAA 
1550 
GGC7CC7777 
1600 
77A77CGCAA 
1650 
7G77GAAAG7 
1700 
7C7GGAAAGA 
1750 
C7G7GGAA7G 
1800 
77ACGG7ACA 
1850 
G7GGC7C7GA 
1900 
AC7AAACC7C 
1950" 
CAACCC7C7C 
2000 
A7CC7AA7CC 
2050 
CAGAA7AA7A 
2100 
CAC7G77ACT 
2150 
C7G7A7CA7C 
2200 
GAC7GCGC77 
2250 
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FIGURE 9 (continued) 



TCCATTCTGG 
2260 
TCGTCTGACC 
2310 
TGGTTCTGGT 
2360 
AGGGTGGCGG 
2410 
GATTTTGATT 
2460 
AAATGCCGAT 
2510 
CTGTCGCTAC 
2560 
TCCGGCCTTG 
2610 
TTCCCAAATG 
2660 
ATTTCCGTCA 
2710 
TTTGTCTTTA 
2760 
AATAAACTTA 
2810 
TTATGTATGT 
2860 
TAATCATGCC 
2910 
TTCCT7CTGG 
2960 
CTTCGGTAAG 
3010 
GGCTTAACTC 
3060 
CCCTCTGACT 
3110 
TCCCTGTTTT 
3160 
ACG7TAAACA 
3210 
TGTTTATTTT 
32 60 
TTGGTAAGAT 
3310 
CTTGATTTAA 
3360 
GCC7CGCGTT 
3410 
CTATTGGGCG 
3460 
GTTCTCGATG 
3510 
GGAAAGACAG 
3560 



CTTTAATGAA 
2270 
TGCCTCAACC 
2320 
GGCGGCTCTG 
2370 
CTCTGAGGGA 
2420 
ATGAAAAGAT 
2470 
GAAAACGCGC 
2520 
TGATTACGGT 
2570 
CTAATGGTAA 
2620 
GCTCAAGTCG 
2670 
ATATTTACCT 
2720 
GCGCTGGTAA 
2770 
TTCCGTGGTG 
2820 
ATTTTCTACG 
2870 
AGTTCTTTTG 
2920 
TAACTTTGTT 
2970 
ATAGCTATTG 
3020 
AATTCTTGTG 
3070 
TTGTTCAGGG 
3120 
TATG7TATTC 
3170 
AAAAA7CG77 
3220 
G7AAC7GGCA 
3270 
7CAGGA7AAA 
3320 
GGC77CAAAA 
3370 
C77AGAA7AC 
3420 
CGG7AA7GA7 
3470 
AG7GCGG7AC 
352C 
CCGATTATTG 
3570 



GA7CCA77CG 
2280 
7CCTG7CAA7 
2330 
AGGG7GG7GG 
2380 
GGCGG77CCG 
2430 
GGCAAACGC7 
2480 
7ACAG7C7GA 
2530 
GCTGCTATCG 
2580 
TGG7GC7AC7 
2630 
G7GACGG7GA 
2680 
TCCC7CCC7C 
2730 
ACCA7A7GAA 
2780 
7C777GCG77 
2830 
777GC7AACA 
2880 
GG7ATTCCG7 
2930 
CGGCTA7CTG 
2980 
C7AT77CA77 
3030 
GG7TA7C7C7 
. 3080 
TGTTCAG77A 
3130 
7CTCTGTAAA 
3180 
TCTTA77TGG 
3230 
AATTAGGC7C 
3280 
A77G7AGC7G 
3330 
CCTCCCGCAA 
3380 
CGGA7AAGCC 
3430 
7CC7ACGA7G 
3480 



777G7GAATA 
2290 
GC7GGCGGCG 
2340 
C7C7GAGGG7 
2390 
G7GG7GGC7C 
2440 
AATAAGGGGG 
2490 
CGC7AAAGGC 
2540 
ATGGT77CA7 
2590 
GGTGA7777G 
2640 
7AA77CACC7 
2690 
AA7CGG77GA 
2740 
TTT7C7A77G 
2790 
7C7777A7A7 
2840 
7AC7GCG7AA 
2890 
TAT7AT7GCG 
2940 
C7TAC7T77C 



2990 
G777C7TGCT 
3040 
C7GA7A77AG 
3090 
A77C7CCCG7 
3140 
GGC7GC7A7T 
3190 
A77GGGATAA 
3240 
7GGAAAGACG 
3290 
GG7GCAAAA7 
•3340 
G7CGGGAGG7 
3390 
T7C7A7ATCT 



77GG777AA7 



3530 
A7TGG7TTCT 
3560 



3440 
AAAA7AAAAA 
3490 
ACCCGT7C77 
3540 
ACA7GC7CG7 
3590 



7CAAGGCC;j^ 
2300 
GC7C7GG7GG 
2350 
GGCGG77C7G 
2400 
7GG7TCCGG7 
2450 
CTATGACCGA 
2500 
AAAC77GA77 
2550 
7GG7GACG77 
2600 
C7GGC7CT.':A 
2650 
7TAATGAATA 
2700 
A7G7CGCCC7 
2750 
A77G7GACAA 
2800 
G77GCCACCT 
2850 
7AAGGAG7C7 
2900 
7T7CC7CGG7 
2950 
T7AAAAAGGG 
3000 
C77A77A77G 
3050 
CGC7CAAT7A 
"3100 
C7AA7GCGC7 
3150 
77CA77777G 
3200 
ATAA7A7GGC 
3250 
C7CG77AGCG 
3300 
AGCAAC7AA7 
3350 
TCGC7AAAAC 
3400 
GA7T7GCT7G 
3450 
CGGC77GC77 
3500 
GGAA7GATAA 
3550 
AAATTAGTAT
3600
FIGURE 9 (continued)



GGGATATTAT 
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TAATTATGAT 
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ATTCTTATTT 
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TTATCACACG 
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CAAACCATTA 


3910 


3920 


3930 


3940 
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AATTTAGGTC 
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ATTAACTAAA 


ATATATTTGA 


AAAAGTTTTC 
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TCGCGTTCTT 
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TTGGATTTGC 
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GAGGTTAAAA 


AGGTAGTCTC 


TCAGACCTAT 
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4070 
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4090 
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GATTTTGATA 


AATTCACTAT 


TGACTCTTCT 


CAGCGTCTTA 


ATCTAAGCTA 


4110 


4120 


4130 


4140 
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TCGCTATGTT 


TTCAAGGATT 


CTAAGGGAAA 


ATTAATTAAT 


AGCGACGATT 


4160 


4170 


4180 


4190 


4200 


TACAGAAGCA 


AGGTTATTCA 


CTCACATATA TTGATTTATG TACTGTTTCC 


4210 


4220 


4230 


4240 


4250 


ATTAAAAAAG 


GTAATTCAAA 


TGAAATTGTT 


AAATGTAATT 


AATTTTGTTT 


4260 


4270 


4280 


4290 
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TCTTGATGTT 


TGTTTCATCA 


TCTTCTTTTG 


CTCAGGTAAT 


TGAAATGAAT 


4310 


4320 


4330 


4340 
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AATTCGCCTC 


TGCGCGATTT 


TGTAACTTGG 


TATTCAAAGC 


AATCAGGCGA 


4360 


4370 


4380 


4390 


4400 


ATCCGTTATT 


GTTTCTCCCG 


ATGTAAAAGG 


TACTGTTACT 


GTATAT.TCAT 


4410 


4420 


■ 4430 


4440 


4450 


CTGACGTTAA 


ACTTGAAAAT 


CTACGCAAT7 


TCTTTATTTC 


TGTTTTACGT 


44 60 


4470 


4480 


4490 


4500 


GCTAATAATT 


TTGATATGGT 


TGGTTCAATT 


CCTTCCATAA 


TTCAGAAGTA 


4510 


4520 


4530 


4540 


4550 


TAATCCAAAC 


AATCAGGTAT 


ATATTGATGA 


ATTGCCATCA 


TCTGATAATC 


4560 


4570 


4580 


4590 


4600 
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TGATAATTCC 


GCTCCTTCTG 


GTGGTTTCTT 
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TGTTTGTAAA 
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ATCTATTGAC 


GGCTCTAATC 


TATTAGTTGT 


TAGTGCACCT 


4760 


4770 


4780 


4790 


4800 


AAAGATATTT 


TAGATAACCr 


TCCTCAATTC 


CTTTCTACTG 


TTGA7TTGCC 


4810 


4820 


4830 


4840 


4850 


AACTGACCAG 


ATATTGATTG 


AGGGTTTGAT 


ATTTGAGGTT 


CAGCAAGGTG 


4860 


4870 


4880 


4890 


4900 


ATGCTTTAGA 


T 1* 1* ^V^I"^ ^jT 


GCTGCTGGCT 


CTCAGCGTGG 


CACTGTTGCA 


4910 


4920 


4S30 


4940 


4950 



EP 0 386 857 A2 



FIGURE 9 (continued)
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GTTTTATCTT 
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GTGATGTTAT 
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GTCAAAGCAA 


CCATAGTACG 
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AGCGCGGCGG 


GTGTGGTGGT 


TACGCGCAGC 


GTGACCGCTA 
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GACGGTTTTT 
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CGCCCTTTGA 


CGTTGGAGTC 


CACGTTCTTT 


AATAGTGGAC 


TCTTGTTCCA 
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5840 
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ACACTCAACC 


CTATCTCGGG 


CTATTCTTTT 


GATTTATAAG 
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GATTTCGGAA 


CCACCATCAA 


ACAGGATTTT 


CGCCTGCTGG 
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GGCAAACCAG 


CGTGGACCGC 


TTGCTGCAAC 


TCTCTCAGGG 


CCAGGCGGTG 
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GGCTCGTATG 
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TTGTGAGCGG 


ATAACAATTT 
CCATGATTAC 


GAATTCGAGC 


TCGCCCGGGG 
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6280 


6290 


6300 






FIGURE 9 (continued) 

ATCTGCCTGA ATAGGTACGA TTTACTAACT GGAAGAGGCA CTAAATGAAC 
6310 6320 6330 6340 6350 

ACGATTAACA TCGCTAAGAA CGACTTCTCT GACATCGAAC TGGCTGCTAT 
6360 6370 6380 6390 6400 

CCCGTTCAAC ACTCTGGCTG ACCATTACGG TGAGCGTTTA GCTCGCGAAC 
6410 6420 6430 6440 6450 

AGTTGGCCCT TGAGCATGAG TCTTACGAGA TGGGTGAAGC ACGCTTCCGC 
6460 6470 6480 6490 6500 

AAGATGTTTG AGCGTCAACT TAAAGCTGGT GAGGTTGCGG ATAACGCTGC 
6510 6520 6530 6540 6550 

CGCCAAGCCT CTCATCACTA CCCTACTCCC TAAGATGATT GCACGCATCA 
6560 6570 6580 6590 6600 

ACGACTGGTT TGAGGAAGTG AAAGCTAAGC GCGGCAAGCG CCCGACAGCC 
6610 6620 6630 6640 6650 

TTCCAGTTCC TGCAAGAAAT CAAGCCGGAA GCCGTAGCGT ACATCACCAT 
6660 6670 6680 6690 6700 

TAAGACCACT CTGGCTTGCC TAACCAGTGC TGACAATACA ACCGTTCAGG 
6710 6720 6730 6740 6750 

CTGTAGCAAG CGCAATCGGT CGGGCCATTG AGGACGAGGC TCGCTTCGGT 
6760 6770 6780 6790 6800 

CGTATCCGTG ACCTTGAAGC TAAGCACTTC AAGAAAAACG TTGAGGAACA 
6810 6820 6830 6840 6850 

ACTCAACAAG CGCGTAGGGC ACGTCTACAA GAAAGCATTT ATGCAAGTTG 
6860 6870 6880 6890 6900 

TCGAGGCTGA CATGCTCTCT AAGGGTCTAC TCGGTGGCGA GGCGTGGTCT 
6910 6920 6930 6940 6950 

TCGTGGCATA AGGAAGACTC TATTCATGTA GGAGTACGCT GCATCGAGAT 
6960 6970 6980 6990 7000 

GCTCATTGAG TCAACCGGAA TGGTTAGCTT ACACCGCCAA AATGCTGGCG 
7010 7020 7030 7040 7050 

TAGTAGGTCA AGACTCTGAG ACTATCGAAC TCGCACCTGA ATACGCTGAG 
7060 7070 7080 7090 7100 

GCTATCGCAA CCCGTGCAGG TGCGCTGGCT GGCATCTCTG CGATGTTCCA 
7110 7120 7130 7140 7150 

ACCTTGCGTA GTTCCTCCTA AGCCGTGGAC TGGCATTACT GGTGGTGGCT 
7160 7170 7180 7190 7200 

ATTGGGCTAA CGGTCGTCGT CCTCTGGCGC TGGTGCGTAC TCACAGTAAG 
7210 7220 7230 7240 7250 

AAAGCACTGA TGCGCTACGA AGACGTTTAC ATGCCTGAGG TGTACAAAGC 
7260 7270 7280 7290 7300 

GATTAACATT GCGCAAAACA CCGCATGGAA AATCAACAAG AAAGTCCTAG 
7310 7320 7330 7340 7350 

CGGTCGCCAA CGTAATCACC AAGTGGAAGC ATTGTCCGGT CGAGGACATC 
7360 7370 7380 7390 7400 

CCTGCGATTG AGCGTGAAGA ACTCCCGATG AAACCGGAAG ACATCGACAT 
7410 7420 7430 7440 7450 

GAATCCTGAG GCTCTCACCG CGTGGAAACG TGCTGCCGCT GCTGTGTACC 
7460 7470 7480 7490 7500 

GCAAGGACAA GGCTCGCAAG TCTCGCCGTA TCAGCCTTGA GTTCATGCTT 
7510 7520 7530 7540 7550 

GAGCAAGCCA ATAAGTTTGC TAACCATAAG GCCATCTGGT TCCCTTACAA 
7560 7570 7580 7590 7600 

CATGGACTGG CGCGGTCGTG TTTACGCTGT GTCAATGTTC AACCCGCAAG 
7610 7620 7630 7640 7650 

FIGURE 9 (continued) 



GTAACGATAT 
7660 

GGTAAGGAAG 
7710 

TGTCGATAAG 
7760 



ACGAGAACAT 
7810 
GCTGAGCAAG 
7860 
TGGGGTACAG 
7910 
TTGACGGGTC 
7960 
GAGGTAGGTG 
8010 
CATCTACGGG 
8060 
CAATCAATGG 
8110 
GGTGAAATCT 
8160 
ATGGCTGGCT 
8210 
CGCTGGCTTA 
8260 
GATACCATTC 
8310 
GCCGAATCAG 

8360 
GCGTGACGGT 

8410 
GCTAAGCTGC 

8460 
TCGCAAGCGT 

8510 
GGCAGGAATA 

8560 
GGTCAGTTCC 

8610 
TGATGCACAC 

8660 
AAGACGGTAG 

8710 
GGAATCGAAT 

8760 
TGACGCTGCG 

8810 
ATGAGTCTTG 
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TTGCACGAGT 

8910 
CTTGAACCTC 
8960 



GACCAAAGGA 
7670 
GTTACTACTG 
7720 
GTTCCGTTCC 
7770 
CATGGCTTGC 
7820 
ATTCTCCGTT 
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CACCACGGCC 
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TTGCTCTGGC 
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GTCGCGCGGT 
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ATTGTTGCTA 
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GACCGATAAC 
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CTGAGAAAGT 
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TACGGTGTTA 
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CGGGTCCAAA 
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AGCCAGCTAT 
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GCTGCTGGAT 
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GGTAGCTGCG 
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TGGCTGCTGA 
8470 
TGCGCTGTGC 
8520 
CAAGAAGCCT 
GCTTACAGCC 
AAACAGGAGT 
CCACCTTCGT 
8720 
CTTTTGCACT 
AACCTGTTCA 
TGATGTACTG 
CTCAATTGGA 
CGTGACATCT 
8970 



CTGCTTACGC 
7680 
GCTGAAAATC 
7730 
CTGAGCGCAT 
7780 
GCTAAGTCTC 
7830 
CTGCTTCCTT 
7880 
TGAGCTATAA 
7930 
ATCCAGCACT 
7980 
TAACTTGCTT 
8030 
AGAAAGTCAA 
8080 
GAAGTAGTTA 
8130 
CAAGCTGGGC 
8180 
CTCGCAGTGT 
8230 
GAGTTCGGCT 
8280 
TGATTCCGGC 
8330 
ACATGGCTAA 
8380 
GTTGAAGCAA 
8430 
GGTCAAAGAT 
8480 
ATTGGGTAAC 
8530 
ATTCAGACGC 
8580 
TACCATTAAC 
8630 
CTGGTATCGC 
8680 
AAGACTGTAG 
8730 
GATTCACGAC 
8780 
AAGCAGTGCG 
8830 
GCTGATTTCT 
8880 
CAAAATGCCA 
8930 
TAGAGTCGGA 
TGGCGAAAGG 
CACGGTGCAA 
CAAGTTCATT 
CACTGGAGAA 
GCGTTCTGCT 
CTGCTCCCTT 
TCTCCGCGAT 
CCTAGTGAAA 
CGAGATTCTA 
8090 
CCGTGACCGA 
8140 
ACTAAGGCAC 
GACTAAGCGT 
8240 
TCCGTCAACA 
AAGGGTCTGA 
GCTGATTTGG 
TGAACTGGCT 
8440 
8540 

FIGURE 9 (continued) 